METHOD FOR IDENTIFYING GENES ENCODING NOVEL
SECRETED OR MEMBRANE-ASSOCIATED PROTEINS

Background of the Invention
The invention relates to methods for identifying
genes encoding novel proteins.

There is considerable medical interest in secreted 
and membrane-associated mammalian proteins. Many such 
proteins, for example, cytokines, are important for 
inducing the growth or differentiation of cells with
which they interact or for triggering one or more 
specific cellular responses. 

An important goal in the design and development of
new therapies is the identification and characterization
of secreted proteins and the genes which encode them.
Traditionally, this goal has been pursued by identifying
a particular response of a particular cell type and
attempting to isolate and purify a secreted protein
capable of eliciting the response. This approach is
limited by a number of factors. First, certain secreted
proteins will not be identified because the responses
they evoke may not be recognizable or measurable.
Second, because in vitro assays must be used to isolate
and purify secreted proteins, somewhat artificial systems
must be used. This raises the possibility that certain
important secreted proteins will not be identified unless
the features of the in vitro system (e.g., cell line,
culture medium, or growth conditions) accurately reflect
the in vivo milieu. Third, the complexity of the effects
of secreted proteins on the cells with which they
interact vastly complicates the task of isolating
important secreted proteins. Any given cell can be
simultaneously subject to the effects of two or more
secreted proteins. Because any two secreted proteins
will not have the same effect on a given cell and because
the effect of a first secreted protein on a given cell
can alter the effect of a second secreted protein on the
same cell, it can be difficult to isolate the secreted
protein or proteins responsible for a given physiological
response. In addition, certain secreted and
membrane-associated proteins may be expressed at levels
that are too low to detect by biological assay or protein
purification.

In another approach, genes encoding secreted
proteins have been isolated using DNA probes or PCR
oligonucleotides which recognize sequence motifs present
in genes encoding known secreted proteins. In addition,
homology-directed searching of Expressed Sequence Tag
(EST) sequences derived by high-throughput sequencing of
specific cDNA libraries has been used to identify genes
encoding secreted proteins. These approaches depend for
their success on a high degree of similarity between the
DNA sequences used as probes and the unknown genes or EST
sequences.

More recently, methods have been developed that
permit the identification of cDNAs encoding a signal
sequence capable of directing the secretion of a
particular protein from certain cell types. Both Honjo,
U.S. Patent No. 5,525,486, and Jacobs, U.S. Patent No.
5,536,637, describe such methods. These methods are said
to be capable of identifying secreted proteins.

The demonstrated clinical utility of several
secreted proteins in the treatment of human disease, for
example, erythropoietin, granulocyte-macrophage colony
stimulating factor (GM-CSF), human growth hormone, and
various interleukins, has generated considerable interest
in the identification of novel secreted proteins. The
method of the invention can be employed as a tool in the
discovery of such novel proteins.
Summary of the Invention
The invention features a method for identifying and
isolating cDNAs that encode secreted or
membrane-associated (e.g., transmembrane) mammalian
proteins. The method of the invention relies upon the
observation that the majority of secreted and
membrane-associated proteins possess at their amino
termini a stretch of hydrophobic amino acid residues
referred to as the "signal sequence." The signal sequence
directs secreted and membrane-associated proteins to a
sub-cellular membrane compartment termed the endoplasmic
reticulum, from which these proteins are dispatched for
secretion or presentation on the cell surface.
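As a rough illustration of what a hydrophobic amino-terminal
stretch looks like in sequence terms, the sketch below scores
sliding windows near the amino terminus with the Kyte-Doolittle
hydropathy scale. The choice of scale, the window length, the
threshold, and the function names are assumptions introduced here
for illustration only; the method of the invention detects signal
sequences functionally, by export of the AP reporter, and not by
computation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(protein, window=8, n_term=30):
    """Highest mean hydropathy of any window in the first n_term residues.

    window and n_term are illustrative defaults, not values from the text.
    """
    region = protein[:n_term].upper()
    if len(region) < window:
        raise ValueError("sequence is shorter than the window")
    return max(
        sum(KD[aa] for aa in region[i:i + window]) / window
        for i in range(len(region) - window + 1)
    )


def has_hydrophobic_n_terminus(protein, threshold=2.0, window=8, n_term=30):
    """True if some amino-terminal window reaches the (assumed) threshold."""
    return max_window_hydropathy(protein, window, n_term) >= threshold
```

A hydrophobic stretch found this way is only suggestive; a positive
result in the AP secretion assay remains the criterion used in the
method.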

The invention describes a method in which cDNAs
that encode signal sequences for secreted or
membrane-associated proteins are isolated by virtue of
their abilities to direct the export of the reporter
protein, alkaline phosphatase (AP), from mammalian cells.
The present method has major advantages over other signal
peptide trapping approaches. The present method is
highly sensitive. This facilitates the isolation of
signal peptide associated proteins that may be difficult
to isolate with other techniques. Moreover, the present
method is amenable to high-throughput screening
techniques and automation. Combined with a novel method
for cDNA library construction in which directional random
primed cDNA libraries are prepared, the invention
comprises a powerful approach to the large scale
isolation of novel secreted proteins.
The invention features a method for identifying a
cDNA nucleic acid encoding a mammalian protein having a
signal sequence, which method includes the following
steps:

a) providing a library of mammalian cDNA;

b) ligating the library of mammalian cDNA to DNA
encoding alkaline phosphatase lacking both a signal
sequence and a membrane anchor sequence to form ligated
DNA;

c) transforming bacterial cells with the ligated
DNA to create a bacterial cell clone library;

d) isolating DNA comprising the mammalian cDNA
from at least one clone in the bacterial cell clone
library;

e) separately transfecting DNA isolated from
clones in step (d) into mammalian cells which do not
express alkaline phosphatase to create a mammalian cell
clone library wherein each clone in the mammalian cell
clone library corresponds to a clone in the bacterial
cell clone library;

f) identifying a clone in the mammalian cell clone
library which expresses alkaline phosphatase;

g) identifying the clone in the bacterial cell
clone library corresponding to the clone in the mammalian
cell clone library identified in step (f); and

h) isolating and sequencing a portion of the
mammalian cDNA present in the bacterial cell library
clone identified in step (g) to identify a mammalian cDNA
encoding a mammalian protein having a signal sequence.

A cDNA library is a collection of nucleic acid
molecules that are a cDNA copy of a sample of mRNA.

In another aspect, the invention features the ptrAP3
expression vector.

In another aspect, the invention features a
substantially pure preparation of ethb0018f2 protein.
Preferably, the ethb0018f2 protein includes an amino acid
sequence substantially identical to the amino acid
sequence shown in FIG. 5 (SEQ ID NO:5) and is derived
from a mammal, for example, a human.
The invention also features purified DNA (for
example, cDNA) which includes a sequence encoding an
ethb0018f2 protein, preferably encoding a human
ethb0018f2 protein (for example, the ethb0018f2 protein
of FIG. 5; SEQ ID NO:5); a vector and a cell which
includes a purified DNA of the invention; and a method of
producing a recombinant ethb0018f2 protein involving
providing a cell transformed with DNA encoding ethb0018f2
protein positioned for expression in the cell, culturing
the transformed cell under conditions for expressing the
DNA, and isolating the recombinant ethb0018f2 protein.
The invention further features recombinant ethb0018f2
protein produced by such expression of a purified DNA of
the invention.

By "ethb0018f2 protein" is meant a polypeptide
which has a biological activity possessed by
naturally-occurring ethb0018f2 protein. Preferably, such a
polypeptide has an amino acid sequence which is at least
85%, preferably 90%, and most preferably 95% or even 99%
identical to the amino acid sequence of the ethb0018f2
protein of FIG. 5 (SEQ ID NO:5).

By "substantially identical" is meant a
polypeptide or nucleic acid having a sequence that is at
least 85%, preferably 90%, and more preferably 95% or
more identical to the sequence of the reference amino
acid or nucleic acid sequence. For polypeptides, the
length of the reference polypeptide sequence will
generally be at least 16 amino acids, preferably at least
20 amino acids, more preferably at least 25 amino acids,
and most preferably 35 amino acids. For nucleic acids,
the length of the reference nucleic acid sequence will
generally be at least 50 nucleotides, preferably at least
60 nucleotides, more preferably at least 75 nucleotides,
and most preferably 110 nucleotides.
Sequence identity can be measured using sequence
analysis software (e.g., Sequence Analysis Software
Package of the Genetics Computer Group, University of
Wisconsin Biotechnology Center, 1710 University Avenue,
Madison, WI 53705).

In the case of polypeptide sequences which are
less than 100% identical to a reference sequence, the
non-identical positions are preferably, but not
necessarily, conservative substitutions for the reference
sequence. Conservative substitutions typically include
substitutions within the following groups: glycine and
alanine; valine, isoleucine, and leucine; aspartic acid
and glutamic acid; asparagine and glutamine; serine and
threonine; lysine and arginine; and phenylalanine and
tyrosine.
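The substitution groups listed above can be written down as a small
lookup, sketched here in Python. The function names and the
three-way classification are illustrative assumptions; only the
seven groups themselves come from the text, and the sketch assumes
two sequences of equal length that are already aligned without gaps.

```python
# The seven conservative substitution groups listed in the text,
# written in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("GA"),   # glycine, alanine
    set("VIL"),  # valine, isoleucine, leucine
    set("DE"),   # aspartic acid, glutamic acid
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("KR"),   # lysine, arginine
    set("FY"),   # phenylalanine, tyrosine
]


def is_conservative(a, b):
    """True if a and b differ but fall within the same listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not substitutions
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_positions(reference, candidate):
    """Count identical, conservative, and non-conservative positions."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must be the same length (ungapped)")
    counts = {"identical": 0, "conservative": 0, "non_conservative": 0}
    for ref_aa, cand_aa in zip(reference.upper(), candidate.upper()):
        if ref_aa == cand_aa:
            counts["identical"] += 1
        elif is_conservative(ref_aa, cand_aa):
            counts["conservative"] += 1
        else:
            counts["non_conservative"] += 1
    return counts
```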

Where a particular polypeptide is said to have a
specific percent identity to a reference polypeptide of a
defined length, the percent identity is relative to the
reference peptide. Thus, a peptide that is 50% identical
to a reference polypeptide that is 100 amino acids long
can be a 50 amino acid polypeptide that is completely
identical to a 50 amino acid long portion of the
reference polypeptide. It might also be a 100 amino acid
long polypeptide which is 50% identical to the reference
polypeptide over its entire length. Of course, many
other polypeptides will meet the same criteria.
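The two cases described above can be reproduced with a short
calculation in which the denominator is always the length of the
reference. This is a minimal sketch under simplifying assumptions:
the candidate is placed against the reference without gaps at a
known offset, and the function name and test sequences are
hypothetical. It is not the procedure used by the sequence analysis
software cited earlier.

```python
def percent_identity_to_reference(reference, candidate, offset=0):
    """Percent identity of candidate relative to the reference length.

    The candidate is compared, without gaps, to the reference starting
    at position offset (0-based). Matches are divided by the length of
    the reference, not the length of the candidate.
    """
    if offset < 0 or offset + len(candidate) > len(reference):
        raise ValueError("candidate must lie within the reference")
    matches = sum(
        1 for i, aa in enumerate(candidate) if reference[offset + i] == aa
    )
    return 100.0 * matches / len(reference)


# A made-up 100 residue reference used only to exercise the function.
reference = "ACDEFGHIKL" * 10

# Case 1: a 50 residue peptide identical to residues 26 to 75.
fragment = reference[25:75]

# Case 2: a 100 residue peptide matching at every other position
# (W does not occur in the reference, so each W is a mismatch).
half_matching = "".join(
    aa if i % 2 == 0 else "W" for i, aa in enumerate(reference)
)
```

Both `percent_identity_to_reference(reference, fragment, offset=25)`
and `percent_identity_to_reference(reference, half_matching)`
evaluate to 50.0, matching the two examples in the text.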

By "protein" and "polypeptide" is meant any chain
of amino acids, regardless of length or
post-translational modification (e.g., glycosylation or
phosphorylation).

By "substantially pure" is meant a preparation
which is at least 60% by weight (dry weight) the compound
of interest, i.e., an ethb0018f2 protein. Preferably the
preparation is at least 75%, more preferably at least
90%, and most preferably at least 99%, by weight the
compound of interest. Purity can be measured by any
appropriate method, e.g., column chromatography,
polyacrylamide gel electrophoresis, or HPLC analysis.
By "purified DNA" is meant DNA that is not
immediately contiguous with both of the coding sequences
with which it is immediately contiguous (one on the 5'
end and one on the 3' end) in the naturally occurring
genome of the organism from which it is derived. The
term therefore includes, for example, a recombinant DNA
which is incorporated into a vector; into an autonomously
replicating plasmid or virus; or into the genomic DNA of
a prokaryote or eukaryote, or which exists as a separate
molecule (e.g., a cDNA or a genomic DNA fragment produced
by PCR or restriction endonuclease treatment) independent
of other sequences. It also includes a recombinant DNA
which is part of a hybrid gene encoding additional
polypeptide sequence.

By "substantially identical" is meant an amino
acid sequence which differs only by conservative amino
acid substitutions, for example, substitution of one
amino acid for another of the same class (e.g., valine
for glycine, arginine for lysine, etc.) or by one or more
non-conservative substitutions, deletions, or insertions
located at positions of the amino acid sequence which do
not destroy the function of the protein (assayed, e.g.,
as described herein). Preferably, such a sequence is at
least 85%, more preferably 90%, and most preferably 95%
identical at the amino acid level to the sequence of FIG.
5 (SEQ ID NO:5). For nucleic acids, the length of
comparison sequences will generally be at least 50
nucleotides, preferably at least 60 nucleotides, more
preferably at least 75 nucleotides, and most preferably
110 nucleotides. A "substantially identical" nucleic
acid sequence codes for a substantially identical amino
acid sequence as defined above.
By "transformed cell" is meant a cell into which
(or into an ancestor of which) has been introduced, by
means of recombinant DNA techniques, a DNA molecule
encoding (as used herein) ethb0018f2 protein.

By "positioned for expression" is meant that the
DNA molecule is positioned adjacent to a DNA sequence
which directs transcription and translation of the
sequence (i.e., facilitates the production of ethb0018f2
protein).

By "purified antibody" is meant antibody which is
at least 60%, by weight, free from the proteins and
naturally-occurring organic molecules with which it is
naturally associated. Preferably, the preparation is at
least 75%, more preferably at least 90%, and most
preferably at least 99%, by weight, antibody.

By "specifically binds" is meant an antibody which
recognizes and binds ethb0018f2 protein but which does
not substantially recognize and bind other molecules in a
sample, e.g., a biological sample, which naturally
includes ethb0018f2 protein.

Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art
to which this invention belongs. Although methods and
materials similar or equivalent to those described herein
can be used in the practice or testing of the present
invention, the preferred methods and materials are
described below. All publications, patent applications,
patents, and other references mentioned herein are
incorporated by reference in their entirety. In case of
conflict, the present specification, including
definitions, will control. In addition, the materials,
methods, and examples are illustrative only and not
intended to be limiting.
Other features and advantages of the invention
will be apparent from the following detailed description,
and from the claims.

Brief Description of the Drawings
Figure 1 is a schematic drawing of a portion of
the ptrAP3 vector.

Figure 2 is a representation of the DNA sequence
of the ptrAP3 vector (SEQ ID NO:1). The bold, underlined
portion is the small fragment removed prior to cDNA
insertion. The italic, underlined portion is the alkaline
phosphatase sequence.

Figure 3 is a representation of the amino acid
sequence of human placental alkaline phosphatase
(Accession No. P05187). The underlined portion is the
signal sequence. The bold, underlined portion is the
membrane anchor sequence.

Figure 4 is a representation of the amino acid
sequence of the alkaline phosphatase encoded by ptrAP3.

Figure 5 is a representation of the cDNA and amino
acid sequence of a portion of a novel secreted protein
identified using the method described in Example 1.

Figure 6 is a representation of an alignment of
the amino acid sequence of clone ethb0018f2 (referred to
here as 8f2) and proteins containing conserved IgG
domains. The proteins depicted are D38492 (neural
adhesion molecule f3); P20241EURO (Drosophila
Neuroglian); P32004EURA (human neural adhesion molecule
L1); P35331G-CA (chick neural adhesion molecule related
protein); Q02246XONI (human Axonin 1); U11031 (rat neural
adhesion molecule BIG1); and X65224 (chicken
Neurofascin). In this figure, conserved motifs within the
IgG domain are highlighted in bold.
Detailed Description
In general terms, the method of the invention
entails the following steps:

1. Preparation of a randomly primed cDNA library
using cDNA prepared from mRNA extracted from mammalian
cells or tissue. The cDNA is inserted into a mammalian
expression vector adjacent to a cDNA encoding placental
alkaline phosphatase which lacks a secretory signal.

2. Amplification of the cDNA library in bacteria.

3. Isolation of the cDNA library.

4. Transfection of the resulting cDNA library
into mammalian cells.

5. Assay of supernatants from the transfected
mammalian cells for alkaline phosphatase activity.

6. Isolation and sequencing of plasmid DNA clones
registering a positive score in the alkaline phosphatase
assay.

7. Isolation of full length cDNA clones of novel
proteins having a signal sequence.

The mammalian cDNA used to create the cDNA library
can be prepared using any known method. Generally, the
cDNA is produced from mRNA. The mRNA can be isolated
from any desired tissue or cell type. For example,
peripheral blood cells, primary cells, tumor cells, or
other cells may be used as a source of mRNA.

The expression vector harboring the modified 
alkaline phosphatase gene can be any vector suitable for 
expression of proteins in mammalian cells. 

The mammalian cells used in the transfection step
can be any suitable mammalian cells, e.g., CHO cells,
mouse L cells, HeLa cells, VERO cells, mouse 3T3 cells,
and 293 cells.

Described below is a specific example of the
method of the invention. Also described below are two
genes, one known and one novel, identified using this
method.

Example 1

Step 1 Generation of Mammalian Signal Peptide Trap cDNA
Libraries

Vector

A cDNA library was prepared using ptrAP3, a
mammalian expression vector containing a cDNA encoding
human placental alkaline phosphatase (AP) lacking a
signal sequence (FIG. 1 and FIG. 2, SEQ ID NO:1). When
ptrAP3 is transfected into a mammalian cell line, such as
COS7 cells, AP protein is neither expressed nor secreted
since the AP cDNA of ptrAP3 does not encode a
translation initiating methionine, a signal peptide, or a
membrane anchor sequence. FIG. 3 (SEQ ID NO:2) provides
the amino acid sequence of naturally occurring AP. FIG.
4 (SEQ ID NO:3) provides the amino acid sequence of the
form of AP encoded by ptrAP3. However, insertion of a
cDNA encoding a signal peptide sequence into ptrAP3 such
that the signal sequence within the cDNA is fused to and
in frame with AP facilitates both the expression and
secretion of AP protein upon transfection of the DNA into
COS7 cells or other mammalian cells. The presence of AP
activity in the supernatants of transfected COS7 cells
therefore indicates the presence of a signal sequence in
the cDNA of interest.

cDNA Synthesis and Ligation

cDNA for ligation to the ptrAP3 vector was
prepared from messenger RNA isolated from human fetal
brain tissue (Clontech, Palo Alto, CA: Catalog #6525-1)
by a modification of a commercially available "ZAP cDNA
synthesis kit" (Stratagene; La Jolla, CA: Catalog
#200401). Synthesis of cDNA involved the following steps.

(a) Single stranded cDNA was synthesized from 5 μg
of human fetal brain messenger RNA using a random hexamer
primer incorporating a XhoI restriction site
(underlined): 5'-CTGACTCGAGNNNNNN-3' (SEQ ID NO:4). This
represented a deviation from the Stratagene protocol and
resulted in a population of randomly primed cDNA
molecules. Random priming was employed rather than the
oligo d(T) priming method suggested by Stratagene in
order to generate short cDNA fragments, some of which
would be expected to encode signal sequences.

(b) The single stranded cDNA generated in step (a)
was rendered double stranded, and DNA linkers containing
a free EcoRI overhang were ligated to both ends of the
double stranded cDNAs using reagents and protocols from
the Stratagene ZAP cDNA synthesis kit according to the
manufacturer's instructions.

(c) The linker-adapted double-stranded cDNA
generated in step (b) was digested with XhoI to generate
a free XhoI overhang at the 3' end of the cDNAs using
reagents from the Stratagene ZAP cDNA synthesis kit
according to the manufacturer's instructions.

(d) Linker-adapted double-stranded cDNAs were size
selected by gel filtration through SEPHACRYL™ S-500 cDNA
Size Fractionation Columns (Gibco BRL; Bethesda, MD:
Catalog #18092-015) according to the manufacturer's
instructions.

(e) Size selected, double-stranded cDNAs
containing a free EcoRI overhang at the 5' end and a free
XhoI overhang at the 3' end were ligated to the ptrAP3
backbone which had been digested with EcoRI and XhoI and
purified from the small, released fragment by agarose gel
electrophoresis.

(f) Ligated plasmid DNAs were transformed into E.
coli strain DH10B by electroporation.

This process resulted in a library of cDNA clones
composed of several million random primed cDNAs (some of
which will encode signal sequences) prepared from human
fetal brain messenger RNA, fused to the AP reporter cDNA,
in the mammalian expression vector ptrAP3.
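
The random hexamer primer of step (a) can be examined with a few
lines of Python. The sketch below locates the XhoI recognition
sequence (CTCGAG) within the primer and enumerates the concrete
sequences represented by the six degenerate N positions. The
function names are hypothetical and the code is offered only as an
illustration of the primer design, not as part of the described
protocol.

```python
from itertools import product

PRIMER = "CTGACTCGAGNNNNNN"  # primer sequence as given in step (a)
XHOI_SITE = "CTCGAG"         # XhoI recognition sequence


def xhoi_position(primer):
    """0-based index of the XhoI site in the primer, or -1 if absent."""
    return primer.find(XHOI_SITE)


def expand_degenerate(primer):
    """Yield every concrete sequence represented by a primer containing N."""
    choices = ["ACGT" if base == "N" else base for base in primer]
    for combination in product(*choices):
        yield "".join(combination)
```

With six N positions the primer stands for 4**6 = 4096 distinct
sequences, each carrying the same XhoI site five bases from its 5'
end (index 4 counting from zero).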

Step 2 Plating and Automated Picking of Bacterial
Colonies

Next, the transformed bacterial cells were plated,
and individual clones were identified. A sample of
transformed E. coli containing the random primed human
fetal brain cDNA library described in Step 1 was plated
for growth as individual colonies, using standard
procedures. Each E. coli colony contained an individual
cDNA clone fused to the AP reporter in the ptrAP3
expression vector. Approximately 20,000 such E. coli
colonies were plated, representing approximately 0.5% of
the total cDNA library.
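
The two figures just quoted imply a total library size, and the
number of colonies implies a number of 96 well plates. The
arithmetic is sketched below; the function names are hypothetical,
and the inputs are the approximate values stated in the text.

```python
def estimate_library_size(colonies_plated, percent_of_library):
    """Total number of clones implied by a plated sample."""
    return colonies_plated * 100 / percent_of_library


def plates_needed(colonies, wells_per_plate=96):
    """Number of plates required to hold one colony per well."""
    return -(-colonies // wells_per_plate)  # ceiling division


total_clones = estimate_library_size(20_000, 0.5)
plates = plates_needed(20_000)
```

Here `total_clones` is 4,000,000, in line with the "several
million" clones reported for the library in Step 1, and `plates`
is 209.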

Next, E. coli colonies were picked from the plates
and inoculated into deep well 96 well plates containing 1
ml of growth medium prepared by standard procedures. The
E. coli cultures were grown overnight by standard
procedures. Each plate was identified by number. Within
each plate, each well contained an individual cDNA clone
in the ptrAP vector identified by well position.

Finally, plasmid DNA was extracted from the
overnight E. coli cultures using a semi-automated 96-well
plasmid DNA miniprep procedure, employing standard
procedures for bacterial lysis, genomic DNA precipitation
and plasmid DNA purification.

The plasmid DNA extraction was performed as 
follows: 

(a) E. coli were centrifuged for 20 minutes using 
a Beckman Centrifuge at 3200 rpm. 



(b) Supernatant was discarded and E. coli pellets
were resuspended in 130 μl WP1 (50 mM TRIS (pH 7.5), 10
mM EDTA, 100 μg/ml RNase A) resuspension solution using a
TITERTEK MULTIDROP™ apparatus.

(c) E. coli pellets were resuspended by vortexing.

(d) 130 μl WP2 (0.2 M NaOH, 0.5% SDS) lysing
solution was added to each well, and the samples were
mixed by vortexing for 5 seconds.

(e) 130 μl WP3 (125 mM potassium acetate, pH 4.8)
neutralizing solution was added to each well, and the
samples were mixed by vortexing for 5 seconds.

(f) Samples were placed on ice for 15 minutes,
mixed by vortexing for 5 seconds, and recentrifuged for
10 minutes at 3200 rpm in a Beckman Centrifuge.

(g) Supernatant (crude DNA extract) was
transferred from each well of each 96 well plate into a
96 well filter plate (Polyfiltronics) using a
TOMTEC/Quadra 96™ transfer apparatus.

(h) 480 μl of Wizard™ Midiprep DNA Purification
Resin (Promega) was added to each well of each plate
containing crude DNA extract using a Titertek Multidrop
apparatus and the samples were left for 5 minutes.

(i) Each 96 well filter plate was placed on a
vacuum housing (Polyfiltronics) and the liquid in each
well was removed by suction generated by vacuum created
with a Lab Port Vacuum pump.

(j) The Wizard Midiprep DNA Purification Resin in
each well (to which plasmid DNA was bound) was washed
four times with 600 μl of Wizard Wash™.

(k) Plates were centrifuged for 5 minutes to
remove excessive moisture from the Wizard Midiprep DNA
Purification Resin.

(l) Purified plasmid DNAs were eluted from the
Wizard Midiprep DNA Purification Resin into collection
plates by addition of 50 μl deionized water to each well
using a Multidrop 8 Channel Pipette, incubation at room
temperature for 15 minutes, and centrifugation for 5
minutes (3200 rpm, Beckman centrifuge).

This process resulted in preparation of plasmid
DNA contained in 96 well plates with each well containing
an individual cDNA clone ligated in the ptrAP expression
vector. Individual clones were identified by plate
number and well position.
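
Identifying each clone by plate number and well position is what
allows a positive well in the later mammalian cell assay to be
traced back to its bacterial clone. The sketch below shows one way
such an address could be computed. It assumes the conventional
96 well layout (rows A to H, columns 1 to 12), row-major filling,
and plates numbered from 1; the filling order and function names
are assumptions, since the text does not specify them.

```python
ROWS = "ABCDEFGH"   # 8 rows of a standard 96 well plate
COLUMNS = 12        # 12 columns of a standard 96 well plate
WELLS = len(ROWS) * COLUMNS


def well_name(index):
    """Convert a 0-based, row-major well index to a name such as 'A1'."""
    if not 0 <= index < WELLS:
        raise ValueError("well index must be between 0 and 95")
    row, column = divmod(index, COLUMNS)
    return f"{ROWS[row]}{column + 1}"


def clone_address(clone_number):
    """Map a 0-based picked-colony number to (plate number, well name)."""
    if clone_number < 0:
        raise ValueError("clone number must be non-negative")
    plate, index = divmod(clone_number, WELLS)
    return plate + 1, well_name(index)
```

Because the same address is kept through DNA extraction,
transfection, and assay, no further bookkeeping is needed to link a
positive supernatant to its plasmid preparation.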

Step 4 Transfection of DNAs into COS7 cells

To determine which of the cDNA clones contained
within the cDNA library encoded functional signal
peptides, individual plasmid DNA preparations were
transfected into COS7 cells as follows.

For each 96 well plate of DNA preparations, one 96
well tissue culture plate containing approximately 10,000
COS7 cells per well was prepared using standard
procedures.

Immediately prior to DNA transfection, the COS7
cell culture medium in each well of each 96 well plate
was replaced with 80 μl of Opti-MEM (Gibco-BRL; catalog
#31985-021) containing 1 μl of Lipofectamine (Gibco-BRL)
and 2 μl (approximately 100-200 ng) of DNA prepared as
described above. Thus, each well of each 96 well plate
containing COS7 cells received DNA representing one
individual cDNA clone from the cDNA library in ptrAP3.

The COS7 cells were incubated with the
Opti-MEM/Lipofectamine/DNA mixture overnight to allow
transfection of cells with the plasmid DNAs.

After overnight incubation, the transfection
medium was removed from the cells and replaced with 80 μl
fresh medium composed of Opti-MEM + 1% fetal calf serum.
Cells were incubated overnight.

Step 5 Alkaline Phosphatase Assay

The secreted alkaline phosphatase activity of the
transfected COS7 cells was measured as follows. Samples
(10 μl) of supernatants from the transfected COS7 cells
were transferred from each well of each 96 well plate
into one well of a Microfluor scintillation plate
(Dynatech: Location Catalog #011-010-7805). AP activity
in the supernatants was determined using the
Phospha-Light Kit (Tropix Inc.; catalog #BP300). AP
assays were performed according to the manufacturer's
instructions using a Wallac Micro-Beta scintillation
counter.

Step 6 Sequencing and Analysis of Positive Clones

The individual plasmid DNAs scoring positive in
the COS7 cell AP secretion assay were analyzed further by
DNA sequencing using standard procedures. The resulting
DNA sequence information was used to perform BLAST
sequence similarity searches of nucleotide and protein
databases to ascertain whether the clone in question
encodes either 1) a known secreted or membrane-associated
protein possessing a signal sequence, or 2) a putative
novel secreted or membrane-associated protein possessing
a putative novel signal sequence.

Identification of the Protein Tyrosine Phosphatase Sigma
(PTPσ) Signal Sequence by Mammalian Signal Peptide trAP

Employing the method described in Example 1, a
cDNA clone designated ethb005c07 was found to score
positive in the COS7 cell transfection AP assay. BLAST
similarity searching with the DNA sequence from this
clone identified ethb005c07 as a cDNA encoding the signal
sequence of protein tyrosine phosphatase sigma (PTPσ), a
previously described protein that is well established in
the scientific literature to be a transmembrane protein
(Pulido et al., Proc. Natl. Acad. Sci. USA 92:11686,
1995).

Identification of a Novel Immunoglobulin Domain
Containing Protein by Mammalian Signal Peptide trAP

Employing the method described in Example 1, a
cDNA clone designated ethb0018f2 was found to score
positive in the COS7 cell transfection AP assay. DNA
sequencing revealed that ethb0018f2 harbors a 1455 base
pair cDNA having a single open reading frame commencing
at nucleotide 55 and continuing to nucleotide 1455.
Thus, the ethb0018f2 cDNA encodes a 467 amino acid open
reading frame (FIG. 5, SEQ ID NO:5) fused to the AP
reporter. Inspection of the ethb0018f2 protein sequence
revealed the presence of a putative signal sequence
spanning amino acids 1 to 20, predicted by the signal
peptide prediction algorithm, signal P (Von Heijne,
Nucleic Acids Res. 14:4683-90, 1986). Thus, ethb0018f2
encodes a partial clone of a novel putative
secreted/membrane protein. BLAST similarity searching of
nucleic acid and protein databases with the ethb0018f2
DNA sequence from this clone revealed similarity to a
family of proteins known to contain a protein motif
referred to as an Immunoglobulin or IgG domain.
Further visual inspection of the ethb0018f2
protein sequence resulted in the identification of 5
consecutive IgG repeats, defined by a conserved spacing
of cysteine, tryptophan, tyrosine, and cysteine residues
(FIG. 5).

FIG. 6 is a depiction of a protein sequence
alignment between clone ethb0018f2 (referred to as 8f2)
and seven related proteins known to contain IgG domains
that are also known to be expressed in the brain. These
proteins are rat neural adhesion molecule f3 (D38492),
Drosophila Neuroglian (P20241), human neural adhesion
molecule L1 (P32004), chick neural adhesion molecule
related (P35331), human Axonin 1 (Q02246), rat neural
adhesion molecule BIG1 (U11031) and chicken Neurofascin
(X65224). Given this sequence similarity, it is likely
that clone ethb0018f2 represents a partial cDNA clone
representing a novel protein, expressed in the brain,
which contains multiple, consecutive IgG domains.
Specifically, since the closest relatives of clone
ethb0018f2 are believed to function as neural adhesion
molecules, it is likely that clone ethb0018f2 represents
a partial cDNA clone of a novel neural adhesion molecule.

Other Embodiments 
It is to be understood that while the invention
has been described in conjunction with the detailed
description thereof, the foregoing description is
intended to illustrate and not limit the scope of the
invention, which is defined by the scope of the appended
claims.
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4951 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAGCTTGGCT GTGGAATGTG TGTCAGTTAG GGTGTGGAAA GTCCCCAGGC TCCCCAGCAG   60
GCAGAAGTAT GCAAAGCATG CATCTCAATT AGTCAGCAAC CAGGTGTGGA AAGTCCCCAG  120
GCTCCCCAGC AGGCAGAAGT ATGCAAAGCA TGCATCTCAA TTAGTCAGCA ACCATAGTCC  180
CGCCCCTAAC TCCGCCCATC CCGCCCCTAA CTCCGCCCAG TTCCGCCCAT TCTCCGCCCC  240
ATGGCTGACT AATTTTTTTT ATTTATGCAG AGGCCGAGGC CGCCTCGGCC TCTGAGCTAT  300
TCCAGAAGTA GTGAGGAGGC TTTTTTGGAG GCCTAGGCTT TTGCAAAAAG CTCCTCCGAT  360
CGAGGGGCTC GCATCTCTCC TTCACGCGCC CGCCGCCCTA CCTGAGGCCG CCATCCACGC  420
CGGTTGAGTC GCGTTCTGCC GCCTCCCGCC TGTGGTGCCT CCTGAACTGC GTCCGCCGTC  480
TAGGTAAGTT TAAAGCTCAG GTCGAGACCG GGCCTTTGTC CGGCGCTCCC TTGGAGCCTA  540
CCTAGACTCA GCCGGCTCTC CACGCTTTGC CTGACCCTGC TTGCTCAACT CTACGTCTTT  600
GTTTCGTTTT CTGTTCTGCG CCGTTACAGA TCCAAGCTCT GAAAAACCAG AAAGTTAACT  660
GGTAAGTTTA GTCTTTTTGT CTTTTATTTC AGGTCCCAGG TCCCGGATCC GGTGATCCAA  720
ATCTAAGAAC TGCTCCTCAG TGAGTGTTGC CTTTACTTCT AGGCCTGTAC GGAAGTGTTA  780
CTTCTGCTCT AAAAGCTGCG GAATTCGCAC CACCGTAGTT TTTACGCCCG GTGAGCGCTC  840
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CACCCGCACC 
GGCCAACGAG 
GCCGCTGGAC 
GCCCACGCTT 
ACCCACCGTG 
GACCGTGGAG 
GGGACTGGGC 
CACTGCCACA 
AGTTGAGGAG 
CAAGAAGCTG 
GATGGGGGTG 
GGGGCCTGAG 
CAATGTAGAC 
CAAGGGCAAC 
GACACGCGGC 
GGGAGTGGTA 
GGTGAACCGC 
CCAGGACATC 
CCGAAAGTAC 
AGGTGGGACC 
TGCCCGGTAT 
CCATCTCATG 
ACTGGACCCC 
CCGCGGCTTC 
GGCTTACCGG 
GCTCACCAGC 
CTTCGGAGGC 
GGACAGGAAG 
CGGCGCCCGG 
AGCAGTGCCC 
CCCGCAGGCG 
CTTCGCCGCC 
CGACGCCGCG 
ACCTGAAACA 
GTTACAAATA 
CTAGTTGTGG 
GAGCTCGAAT 
GGCTGCGGCG 
GGGATAACGC 
AGGCCGCGTT 
GACGCTCAAG 
CTGGAAGCTC 
CCTTTCTCCC 
CGGTGTAGGT 
GCTGCGCCTT 
CACTGGCAGC 
AGTTCTTGAA 
CTCTGCTGAA 
CCACCGCTGG 
GATCTCAAGA 
CACGTTAAGG 
ATTAAAAATG 
ACCAATGCTT 
TTGCCTGACT 
GTGCTGCAAT 
AGCCAGCCGG 
CTATTAATTG 
TTGTTGCCAT 
GCTCCGGTTC 
TTAGCTCCTT 
TGGTTATGGC 
TGACTGGTGA 
CTTGCCCGGC 
TCATTGGAAA 
GTTCGATGTA 
TTTCTGGGTG 
GGAAATGTTG 
ATTGTCTCAT 
CGCGCACATT 


TACAAGCGCG 
CGCCTCGGGG 
GAGGGCAACC 
GCACCGTCCG 
CAGCTGATGG 
CCTGGGCTGG 
GTGCAGACCG 
GAGGGCATGG 
GAGAACCCGG 
CAGCCTGCAC 
TCTACGGTGA 
ATACCCCTGG 
AAACATGTGC 
TTCCAGACCA 
AACGAGGTCA 
ACCACCACAC 
AACTGGTACT 
GCTACGCAGC 
ATGTTTCGCA 
AGGCTGGACG 
GTGTGGAACC 
GGTCTCTTTG 
TCCCTGATGG 
TTCCTCTTCG 
GCACTGACTG 
GAGGAGGACA 
TACCCCCTGC 
GCCTACACGG 
CCGGATGTTA 
CTGGACGAAG 
CACCTGGTTC 
TGCCTGGAGC 
CACCCGGGTT 
TAAAATGAAT 
AAGCAATAGC 
TTTGTCCAAA 
TAATTCCTCT 
AGCGGTATCA 
AGGAAAGAAC 
GCTGGCGTTT 
TCAGAGGTGG 
CCTCGTGCGC 
TTCGGGAAGC 
CGTTCGCTCC 
ATCCGGTAAC 
AGCCACTGGT 
GTGGTGGCCT 
GCCAGTTACC 
TAGCGGTGGT 
AGATCCTTTG 
GATTTTGGTC 
AAGTTTTAAA 
AATCAGTGAG 
CCCCGTCGTG 
GATACCGCGA 
AAGGGCCGAG 
TTGCCGGGAA 
TGCTACAGGC 
CCAACGATCA 
CGGTCCTCCG 
AGCACTGCAT 
GTACTCAACC 
GTCAATACGG 
ACGTTCTTCG 
ACCCACTCGT 
AGCAAAAACA 
AATACTCATA 
GAGCGGATAC 
TCCCCGAAAA 


TGTATGATGA 
AGTTTGCCTA 
CAACACCTAG 
AAGAAAAGCG 
TACCCAAGCG 
AGCCCGAGGT 
TGGACGTTCA 
AGACACAAAC 
ACTTCTGGAA 
AGACAGCCGC 
CAGCTGCCAG 
CCATGGACCG 
CAGACAGTGG 
TTGGCTTGAG 
TCTCCGTGAT 
GAGTGCAGCA 
CGGACGCCGA 
TCATCTCCAA 
TGGGAACCCC 
GGAAGAATCT 
GCACTGAGCT 
AGCCTGGAGA 
AGATGACAGA 
TGGAGGGTGG 
AGACGATCAT 
CGCTGAGCCT 
GAGGGAGCTC 
TCCTCCTATA 
CCGAGAGCGA 
AGACCCACGC 
ACGGCGTGCA 
CCTACACCGC 
GAACTAGTCT 
GCAATTGTTG 
ATCACAAATT 
CTCATCAATG 
TCCGCTTCCT 
GCTCACTCAA 
ATGTGAGCAA 
TTCCATAGGC 
CGAAACCCGA 
TCTCCTGTTC 
GTGGCGCTTT 
AAGCTGGGCT 
TATCGTCTTG 
AACAGGATTA 
AACTACGGCT 
TTCGGAAAAA 
TTTTTTGTTT 
ATCTTTTCTA 
ATGAGATTAT 
TCAATCTAAA 
GCACCTATCT 
TAGATAACTA 
GACCCACGCT 
CGCAGAAGTG 
GCTAGAGTAA 
ATCGTGGTGT 
AGGCGAGTTA 
ATCGTTGTCA 
AATTCTCTTA 
AAGTCATTCT 
GATAATACCG 
GGGCGAAAAC 
GCACCCAACT 
GGAAGGCAAA 
CTCTTCCTTT 
ATATTTGAAT 
GTGCCACCTG 


20 - 

GGTGTACGGC 
CGGAAAGCGG 
CCTAAAGCCC 
CGGCCTAAAG 
CCAGCGACTG 
CCGCGTGCGG 
GATACCCACC 
GTCCCCGGTT 
CCGCGAGGCA 
CAAGAACCTC 
GATCCTAAAA 
CTTCCCATAT 
AGCCACAGCC 
TGCAGCCGCC 
GAATCGGGCC 
CGCCTCGCCA 
CGTGCCTGCC 
CATGGACATT 
AGACCCTGAG 
GGTGCAGGAA 
CATGCAGGCT 
CATGAAATAC 
GGCTGCCCTG 
TCGCATCGAC 
GTTCGACGAC 
CGTCACTGCC 
CATCTTCGGG 
CGGAAACGGT 
GAGCGGGAGC 
AGGCGAGGAC 
GGAGCAGACC 
CTGCGACCTG 
AGAGAAAAAA 
TTGTTAACTT 
TCACAAATAA 
TATCTTATCA 
CGCTCACTGA 
AGGCGGTAAT 
AAGGCCAGCA 
TCCGCCCCCC 
CAGGACTATA 
CGACCCTGCC 
CTCAATGCTC 
GTGTGCACGA 
AGTCCAACCC 
G GAG AGOG AG 
ACACTAGAAG 
GAGTTGGTAG 
GCAAGCAGCA 
CGGGGTCTGA 
CAAAAAGGAT 
GTATATATGA 
CAGCGATCTG 
CGATACGGGA 
CACCGGCTCC 
GTCCTGCAAC 
GTAGTTCGCC 
CACGCTCGTC 
CATGATCCCC 
GAAGTAAGTT 
CTGTCATGCC 
GAGAATAGTG 
CGCCACATAG 
TCTCAAGGAT 
GATCTTCAGC 
ATGCCGCAAA 
TTCAATATTA 
GTATTTAGAA 
C 


GACGAGGACC 
CATAAGGACA 
GTGACACTGC 
CGCGAGTCTG 
GAAGATGTCT 
CCAATCAAGC 
ACCAGTAGCA 
GCCTAGCTCG 
GCCGAGGCCC 
ATCATCTTCC 
GGGCAGAAGA 
GTGGCTCTGT 
ACGGCCTACC 
CGCTTTAACC 
AAGAAAGCAG 
GCCGGCACCT 
TCGGCCCGCC 
GACGTGATCC 
TACCCAGATG 
TGGCTGGCGA 
TGCCTGGAGC 
GAGATCCACC 
CGCCTGCTGA 
CATGGTCATC 
GCCATTGAGA 
GACCACTCCC 
CTGGCCCCTG 
CCAGGCTATG 
CCCGAGTATC 
GTGGCGGTGT 
TTCATAGCGC 
GCGCCCCCCG 
CCTCCCACAC 
GTTTATTGCA 
AGCATTTTTT 
TGTCTGGATC 
CTCGCTGCGC 
ACGGTTATCC 
AAAGGCCAGG 
TGACGAGCAT 
AAGATACCAG 
GCTTACCGGA 
ACGCTGTAGG 
ACCCCCCGTT 
GGTAAGACAC 
GTATGTAGGC 
GACAGTATTT 
CTCTTGATCC 
GATTACGCGC 
CGCTCAGTGG 
CTTCACCTAG 
GTAAACTTGG 
TCTATTTCGT 
GGGCTTACCA 
AGATTTATCA 
TTTATCCGCC 
AGTTAATAGT 
GTTTGGTATG 
CATGTTGTGC 
GGCCGCAGTG 
ATCCGTAAGA 
TATGCGGCGA 
CAGAACTTTA 
CTTACCGCTG 
ATCTTTTACT 
AAAGGGAATA 
TTGAAGCATT 
AAATAAACAA 
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TGCTTGAGCA 

900 

TGTTGGCGTT 

960 

AGCAGGTGCT 

1020 

GTGACTTGGC 

1080 

TGGAAAAAAT 

1140 

AGGTGGCACC 

1200 

CTAGTATTGC 

1260 

AGATCATCCC 

1320 

TGGGTGCCGC 

1380 

TGGGCGATGG 

1440 

AGGACAAACT 

1500 

CCAAGACATA 

1560 

TGTGCGGGGT 

1620 

AGTGCAACAC 

1680 

GGAAGTCAGT 

1740 

ACGCCCACAC 

1800 

AGGAGGGGTG 

1860 

TAGGTGGAGG 

1920 

ACTACAGCCA 

1980 

AGCGCCAGGG 

2040 

CGTCTGTGAC 

2100 

GAGACTCCAC 

2160 

GCAGGAACCC 

2220 

ATGAAAGCAG 

2280 

GGGCGGGCCA 

2340 

ACGTCTTCTC 

2400 

GCAAGGCCCG 

2460 

TGCTCAAGGA 

2520 

GGCAGCAGTC 

2580 

TCGCGCGCGG 

2640 

ACGTCATGGC 

2700 

CCGGCACCAC 

2760 

CTCCCCCTGA 

2820 

GCTTATAATG 

2880 

TCACTGCATT 

2940 

CCCGGGTACC 

3000 

TCGGTCGTTC 

3060 

ACAGAATCAG 

3120 

AACCGTAAAA 

3180 

CACAAAAATC 

3240 

GCGTTTCCCC 

3300 

TACCTGTCCG 

3360 

TATCTCAGTT 

3420 

CAGCCCGACC 

3480 

GACTTATCGC 

3540 

GGTGCTACAG 

3600 

GGTATCTGCG 

3660 

GGCAAACAAA 

3720 

AGAAAAAAAG 

3780 

AACGAAAACT 

3840 

ATCCTTTTAA 

3900 

TCTGACAGTT 

3960 

TCATCCATAG 

4020 

TCTGGCCCCA 

4080 

GCAATAAACC 

4140 

TCCATCCAGT 

4200 

TTGCGCAACG 

4260 

GCTTCATTCA 

4320 

AAAAAAGCGG 

4380 

TTATCACTCA 

4440 

TGCTTTTCTG 

4500 

CCGAGTTGCT 

4560 

AAAGTGCTCA 

4620 

TTGAGATCCA 

4680 

TTCACCAGCG 

4740 

AGGGCGACAC 

4800 

TATCAGGGTT 

4860 

ATAGGGGTTC 

4920 


4951 
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(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 530 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu 
1 5 10 15 

Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu 
20 25 30 

Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr 
35 40 45 

Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser 
50 55 60 

Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu 
65 70 75 80 

Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu 
85 90 95 

Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr 
100 105 110 

Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly 
115 120 125 

Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn 
130 135 140 

Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val 
145 150 155 160 

Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr 
165 170 175 

Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro 
180 185 190 

Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile 
195 200 205 

Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met 
210 215 220 

Phe Arg Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln 
225 230 235 240 

Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala 
245 250 255 

Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln 
260 265 270 

Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro 
275 280 285 

Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser 
290 295 300 

Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro 
305 310 315 320 

Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His 
325 330 335 

His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp 
340 345 350 

Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu 
355 360 365 

Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr 
370 375 380 

Pro Leu Arg Gly Ser Ser Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg 
385 390 395 400 

Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr 
405 410 415 

Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly 
420 425 430 

Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr 
435 440 445 

His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His 
450 455 460 

Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala 
465 470 475 480 

Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro 
485 490 495 

Ala Gly Thr Thr Asp Ala Ala His Pro Gly Arg Ser Val Val Pro Ala 
500 505 510 

Leu Leu Pro Leu Leu Ala Gly Thr Leu Leu Leu Leu Glu Thr Ala Thr 
515 520 525 

Ala Pro 
530 


(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 489 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 


Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu Ala 
1 5 10 15 

Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr Ala 
20 25 30 

Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser Thr 
35 40 45 

Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu Gly 
50 55 60 

Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu Ser 
65 70 75 80 

Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr Ala 
85 90 95 

Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly Leu 
100 105 110 

Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn Glu 
115 120 125 

Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val Gly 
130 135 140 

Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr Tyr 
145 150 155 160 

Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro Ala 
165 170 175 

Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile Ser 
180 185 190 

Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met Phe 
195 200 205 

Arg Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln Gly 
210 215 220 

Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala Lys 
225 230 235 240 

Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln Ala 
245 250 255 

Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro Gly 
260 265 270 

Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser Leu 
275 280 285 

Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro Arg 
290 295 300 

Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His His 
305 310 315 320 

Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp Asp 
325 330 335 

Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu Ser 
340 345 350 

Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr Pro 
355 360 365 

Leu Arg Gly Ser Ser Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg Asp 
370 375 380 

Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr Val 
385 390 395 400 

Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly Ser 
405 410 415 

Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr His 
420 425 430 

Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His Leu 
435 440 445 

Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala Phe 
450 455 460 

Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro Ala 
465 470 475 480 

Gly Thr Thr Asp Ala Ala His Pro Gly 
485 


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CTGGACTCGA GNNNNNN 
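
The run of N residues in SEQ ID NO: 4 makes this a degenerate sequence, that is, one printed string standing for a family of concrete oligonucleotides. The sketch below is illustrative only and is not part of the original listing; it assumes that N denotes any one of A, C, G, or T, and the helper names `count_variants` and `expand` are hypothetical.

```python
from itertools import islice, product

# SEQ ID NO: 4 as printed, with the space between blocks removed.
PRIMER = "CTGGACTCGAGNNNNNN"
BASES = "ACGT"  # assumption: N stands for any one of these four bases


def count_variants(seq: str) -> int:
    """Number of distinct sequences represented by a sequence containing N."""
    return len(BASES) ** seq.count("N")


def expand(seq: str):
    """Lazily yield every concrete sequence the degenerate string can represent."""
    n_positions = [i for i, base in enumerate(seq) if base == "N"]
    for combo in product(BASES, repeat=len(n_positions)):
        chars = list(seq)
        for pos, base in zip(n_positions, combo):
            chars[pos] = base
        yield "".join(chars)


if __name__ == "__main__":
    # Length of the printed sequence and size of the family it describes.
    print(len(PRIMER), count_variants(PRIMER))
    # Show only the first few members rather than the whole family.
    for variant in islice(expand(PRIMER), 3):
        print(variant)
```

Under that assumption, the six N positions account for 4^6 concrete sequences, and the printed length of 17 agrees with the LENGTH entry above.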

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 465 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Trp Leu Val Thr Phe Leu Leu Leu Leu Asp Ser Leu His Lys Ala 
1 5 10 15 

Arg Pro Glu Asp Val Gly Thr Ser Leu Tyr Phe Val Asn Asp Ser Leu 
20 25 30 

Gln Gln Val Thr Phe Ser Ser Ser Val Gly Val Val Val Pro Cys Pro 
35 40 45 

Ala Ala Gly Ser Pro Ser Ala Ala Leu Arg Trp Tyr Leu Ala Thr Gly 
50 55 60 

Asp Asp Ile Tyr Asp Val Pro His Ile Arg His Val His Ala Asn Gly 
65 70 75 80 

Thr Leu Gln Leu Tyr Pro Phe Ser Pro Ser Ala Phe Asn Ser Phe Ile 
85 90 95 

His Asp Asn Asp Tyr Phe Cys Thr Ala Glu Asn Ala Ala Gly Lys Ile 
100 105 110 

Arg Ser Pro Asn Ile Arg Val Lys Ala Val Phe Arg Glu Pro Tyr Thr 
115 120 125 

Val Arg Val Glu Asp Gln Arg Ser Met Arg Gly Asn Val Ala Val Phe 
130 135 140 

Lys Cys Leu Ile Pro Ser Ser Val Gln Glu Tyr Val Ser Val Val Ser 
145 150 155 160 

Trp Glu Lys Asp Thr Val Ser Ile Ile Pro Glu Asn Arg Phe Phe Ile 
165 170 175 

Thr Tyr His Gly Gly Leu Tyr Ile Ser Asp Val Gln Lys Glu Asp Ala 
180 185 190 

Leu Ser Thr Tyr Arg Cys Ile Thr Lys His Lys Tyr Ser Gly Glu Thr 
195 200 205 

Arg Gln Ser Asn Gly Ala Arg Leu Ser Val Thr Asp Pro Ala Glu Ser 
210 215 220 

Ile Pro Thr Ile Leu Asp Gly Phe His Ser Gln Glu Val Trp Ala Gly 
225 230 235 240 

His Thr Val Glu Leu Pro Cys Thr Ala Ser Gly Tyr Pro Ile Pro Ala 
245 250 255 

Ile Arg Trp Leu Lys Asp Gly Arg Pro Leu Pro Ala Asp Ser Arg Trp 
260 265 270 

Thr Lys Arg Ile Thr Gly Leu Thr Ile Ser Asp Leu Arg Thr Glu Asp 
275 280 285 

Ser Gly Thr Tyr Ile Cys Glu Val Thr Asn Thr Phe Gly Ser Ala Glu 
290 295 300 

Ala Thr Gly Ile Leu Met Val Ile Asp Pro Leu His Val Thr Leu Thr 
305 310 315 320 

Pro Lys Lys Leu Lys Thr Gly Ile Gly Ser Thr Val Ile Leu Ser Cys 
325 330 335 

Ala Leu Thr Gly Ser Pro Glu Phe Thr Ile Arg Trp Tyr Arg Asn Thr 
340 345 350 

Glu Leu Val Leu Pro Asp Glu Ala Ile Ser Ile Arg Gly Leu Ser Asn 
355 360 365 

Glu Thr Leu Leu Ile Thr Ser Ala Gln Lys Ser His Ser Gly Ala Tyr 
370 375 380 

Gln Cys Phe Ala Thr Arg Lys Ala Gln Thr Ala Gln Asp Phe Ala Ile 
385 390 395 400 

Ile Ala Leu Glu Asp Gly Thr Pro Arg Ile Val Ser Ser Phe Ser Glu 
405 410 415 

Lys Val Val Asn Pro Gly Glu Gln Phe Ser Leu Met Cys Ala Ala Lys 
420 425 430 

Gly Ala Pro Pro Pro Thr Val Thr Trp Ala Leu Asp Asp Glu Pro Ile 
435 440 445 

Val Arg Asp Gly Ser His Arg Thr Asn Gln Tyr Thr Met Ser Asp Gly 
450 455 460 

Thr 
465 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1493 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 99...1493 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 


GGCACGAGGG CGGCTGGGAG CGCGCTGAGC GGGGGAGAGG CGCTGCCGCA CGGCCGGCCA 60 
CAGGACCACC TCCCCGGAGA ATAGGGCCTC TTTATGGC ATG TGG CTG GTA ACT TTC 116 
Met Trp Leu Val Thr Phe 
1 5 

CTC CTG CTC CTG GAC TCT TTA CAC AAA GCC CGC CCT GAA GAT GTT GGC 164 
Leu Leu Leu Leu Asp Ser Leu His Lys Ala Arg Pro Glu Asp Val Gly 
10 15 20 

ACC AGC CTC TAC TTT GTA AAT GAC TCC TTG CAG CAG GTG ACC TTT TCC 212 
Thr Ser Leu Tyr Phe Val Asn Asp Ser Leu Gln Gln Val Thr Phe Ser 
25 30 35 

AGC TCC GTG GGG GTG GTG GTG CCC TGC CCG GCC GCG GGC TCC CCC AGC 260 
Ser Ser Val Gly Val Val Val Pro Cys Pro Ala Ala Gly Ser Pro Ser 
40 45 50 

GCG GCC CTT CGA TGG TAC CTG GCC ACA GGG GAC GAC ATC TAC GAC GTG 308 
Ala Ala Leu Arg Trp Tyr Leu Ala Thr Gly Asp Asp Ile Tyr Asp Val 
55 60 65 70 

CCG CAC ATC CGG CAC GTG CAC GCG AAC GGG ACG CTG CAG CTG TAC CCC 356 
Pro His Ile Arg His Val His Ala Asn Gly Thr Leu Gln Leu Tyr Pro 
75 80 85 

TTC TCC CCC TCG GCG TTC AAT AGC TTT ATC CAC GAC AAT GAC TAC TTC 404 
Phe Ser Pro Ser Ala Phe Asn Ser Phe Ile His Asp Asn Asp Tyr Phe 
90 95 100 

TGC ACC GCG GAG AAC GCT GCC GGG AAG ATC CGG AGC CCC AAC ATC CGC 452 
Cys Thr Ala Glu Asn Ala Ala Gly Lys Ile Arg Ser Pro Asn Ile Arg 
105 110 115 

GTG AAA GCA GTT TTC AGG GAA CCC TAC ACC GTG CGG GTG GAG GAT CAA 500 
Val Lys Ala Val Phe Arg Glu Pro Tyr Thr Val Arg Val Glu Asp Gln 
120 125 130 

AGG TCA ATG CGT GGG AAC GTG GCC GTG TTC AAG TGC CTC ATC CCC TCT 548 
Arg Ser Met Arg Gly Asn Val Ala Val Phe Lys Cys Leu Ile Pro Ser 
135 140 145 150 

TCA GTG CAG GAA TAT GTT AGC GTT GTA TCT TGG GAG AAA GAC ACA GTC 596 
Ser Val Gln Glu Tyr Val Ser Val Val Ser Trp Glu Lys Asp Thr Val 
155 160 165 

TCC ATC ATC CCA GAA AAC AGG TTT TTT ATT ACC TAC CAC GGG GGG CTG 644 
Ser Ile Ile Pro Glu Asn Arg Phe Phe Ile Thr Tyr His Gly Gly Leu 
170 175 180 

TAC ATC TCT GAC GTA CAG AAG GAG GAC GCC CTC TCC ACC TAT CGC TGC 692 
Tyr Ile Ser Asp Val Gln Lys Glu Asp Ala Leu Ser Thr Tyr Arg Cys 
185 190 195 

ATC ACC AAG CAC AAG TAT AGC GGG GAG ACC CGG CAG AGC AAT GGG GCA 740 
Ile Thr Lys His Lys Tyr Ser Gly Glu Thr Arg Gln Ser Asn Gly Ala 
200 205 210 

CGC CTC TCT GTG ACA GAC CCT GCT GAG TCG ATC CCC ACC ATC CTG GAT 788 
Arg Leu Ser Val Thr Asp Pro Ala Glu Ser Ile Pro Thr Ile Leu Asp 
215 220 225 230 

GGC TTC CAC TCC CAG GAA GTG TGG GCC GGC CAC ACC GTG GAG CTG CCC 836 
Gly Phe His Ser Gln Glu Val Trp Ala Gly His Thr Val Glu Leu Pro 
235 240 245 

TGC ACC GCC TCG GGC TAC CCT ATC CCC GCC ATC CGC TGG CTC AAG GAT 884 
Cys Thr Ala Ser Gly Tyr Pro Ile Pro Ala Ile Arg Trp Leu Lys Asp 
250 255 260 

GGC CGG CCC CTC CCG GCT GAC AGC CGC TGG ACC AAG CGC ATC ACA GGG 932 
Gly Arg Pro Leu Pro Ala Asp Ser Arg Trp Thr Lys Arg Ile Thr Gly 
265 270 275 

CTG ACC ATC AGC GAC TTG CGG ACC GAG GAC AGC GGC ACC TAC ATT TGT 980 
Leu Thr Ile Ser Asp Leu Arg Thr Glu Asp Ser Gly Thr Tyr Ile Cys 
280 285 290 

GAG GTC ACC AAC ACC TTC GGT TCG GCA GAG GCC ACA GGC ATC CTC ATG 1028 
Glu Val Thr Asn Thr Phe Gly Ser Ala Glu Ala Thr Gly Ile Leu Met 
295 300 305 310 

GTC ATT GAT CCC CTT CAT GTG ACC CTG ACA CCA AAG AAG CTG AAG ACC 1076 
Val Ile Asp Pro Leu His Val Thr Leu Thr Pro Lys Lys Leu Lys Thr 
315 320 325 

GGC ATT GGC AGC ACG GTC ATC CTC TCC TGT GCC CTG ACG GGC TCC CCA 1124 
Gly Ile Gly Ser Thr Val Ile Leu Ser Cys Ala Leu Thr Gly Ser Pro 
330 335 340 

GAG TTC ACC ATC CGC TGG TAT CGC AAC ACG GAG CTG GTG CTG CCT GAC 1172 
Glu Phe Thr Ile Arg Trp Tyr Arg Asn Thr Glu Leu Val Leu Pro Asp 
345 350 355 

GAG GCC ATC TCC ATC CGT GGG CTC AGC AAC GAG ACG CTG CTC ATC ACC 1220 
Glu Ala Ile Ser Ile Arg Gly Leu Ser Asn Glu Thr Leu Leu Ile Thr 
360 365 370 

TCG GCC CAG AAG AGC CAT TCC GGG GCC TAC CAG TGC TTC GCT ACC CGC 1268 
Ser Ala Gln Lys Ser His Ser Gly Ala Tyr Gln Cys Phe Ala Thr Arg 
375 380 385 390 

AAG GCC CAG ACC GCC CAG GAC TTT GCC ATC ATT GCA CTT GAG GAT GGC 1316 
Lys Ala Gln Thr Ala Gln Asp Phe Ala Ile Ile Ala Leu Glu Asp Gly 
395 400 405 

ACG CCC CGC ATC GTC TCG TCC TTC AGC GAG AAG GTG GTC AAC CCC GGG 1364 
Thr Pro Arg Ile Val Ser Ser Phe Ser Glu Lys Val Val Asn Pro Gly 
410 415 420 

GAG CAG TTC TCA CTG ATG TGT GCG GCC AAG GGC GCC CCG CCC CCC ACG 1412 
Glu Gln Phe Ser Leu Met Cys Ala Ala Lys Gly Ala Pro Pro Pro Thr 
425 430 435 

GTC ACC TGG GCC CTC GAC GAT GAG CCC ATC GTG CGG GAT GGC AGC CAC 1460 
Val Thr Trp Ala Leu Asp Asp Glu Pro Ile Val Arg Asp Gly Ser His 
440 445 450 

CGC ACC AAC CAG TAC ACC ATG TCG GAC GGC ACC 1493 
Arg Thr Asn Gln Tyr Thr Met Ser Asp Gly Thr 
455 460 465 
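
The coding-sequence location given for SEQ ID NO: 6 can be checked arithmetically against the numbered translation printed beneath it. The sketch below is illustrative only and is not part of the original listing; the function names are hypothetical, the coordinates are read as 1-based and inclusive, and the small codon dictionary covers only the six codons of the excerpt rather than the full genetic code.

```python
# Coordinates as printed in the feature table (assumed 1-based, inclusive).
CDS_START, CDS_END = 99, 1493

# First two printed lines of SEQ ID NO: 6 (bases 1-116); whitespace is removed below.
PRINTED_PREFIX = """
GGCACGAGGG CGGCTGGGAG CGCGCTGAGC GGGGGAGAGG CGCTGCCGCA CGGCCGGCCA
CAGGACCACC TCCCCGGAGA ATAGGGCCTC TTTATGGC ATG TGG CTG GTA ACT TTC
"""

# Only the codons needed for this excerpt; not a full genetic code table.
CODONS = {"ATG": "Met", "TGG": "Trp", "CTG": "Leu",
          "GTA": "Val", "ACT": "Thr", "TTC": "Phe"}


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in an inclusive, 1-based interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"interval length {length} is not a multiple of three")
    return length // 3


def translate_from(seq: str, start: int) -> list:
    """Translate complete codons of seq, beginning at 1-based position start."""
    return [CODONS[seq[i:i + 3]] for i in range(start - 1, len(seq) - 2, 3)]


prefix = "".join(PRINTED_PREFIX.split())

if __name__ == "__main__":
    print(codon_count(CDS_START, CDS_END))          # codons in the stated location
    print(len(prefix))                              # bases covered by the excerpt
    print(" ".join(translate_from(prefix, CDS_START)))
```

The interval 99 to 1493 spans 1395 bases, that is, 465 codons, which agrees with the final residue number (465) printed under the last line of the translation, and position 99 falls on the ATG that begins the printed translation.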

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 462 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Trp Leu Val Thr Phe Leu Leu Leu Leu Asp Ser Leu His Lys Ala 
1 5 10 15 

Arg Pro Glu Asp Val Gly Thr Ser Leu Tyr Phe Val Asn Asp Ser Leu 
20 25 30 

Gln Gln Val Thr Phe Ser Ser Ser Val Gly Val Val Val Pro Cys Pro 
35 40 45 

Ala Ala Gly Ser Pro Ser Ala Ala Leu Arg Trp Tyr Leu Ala Thr Gly 
50 55 60 

Asp Asp Ile Tyr Asp Val Pro His Ile Arg His Val His Ala Asn Gly 
65 70 75 80 

Thr Leu Gln Leu Tyr Pro Phe Ser Pro Ser Ala Phe Asn Ser Phe Ile 
85 90 95 

His Asp Asn Asp Tyr Phe Cys Thr Ala Glu Asn Ala Ala Gly Lys Ile 
100 105 110 

Arg Ser Pro Asn Ile Arg Val Lys Ala Val Phe Arg Glu Pro Tyr Thr 
115 120 125 

Val Arg Val Glu Asp Gln Arg Ser Met Arg Gly Asn Val Ala Val Phe 
130 135 140 

Lys Cys Leu Ile Pro Ser Ser Val Gln Glu Tyr Val Ser Val Val Ser 
145 150 155 160 

Trp Glu Lys Asp Thr Val Ser Ile Ile Pro Glu Asn Arg Phe Phe Ile 
165 170 175 

Thr Tyr His Gly Gly Leu Tyr Ile Ser Asp Val Gln Lys Glu Asp Ala 
180 185 190 

Leu Ser Thr Tyr Arg Cys Ile Thr Lys His Lys Tyr Ser Gly Glu Thr 
195 200 205 

Arg Gln Ser Asn Gly Ala Arg Leu Ser Val Thr Asp Pro Ala Glu Ser 
210 215 220 

Ile Pro Thr Ile Leu Asp Gly Phe His Ser Gln Glu Val Trp Ala Gly 
225 230 235 240 

His Thr Val Glu Leu Pro Cys Thr Ala Ser Gly Tyr Pro Ile Pro Ala 
245 250 255 

Ile Arg Trp Leu Lys Asp Gly Arg Pro Leu Pro Ala Asp Ser Arg Trp 
260 265 270 

Thr Lys Arg Ile Thr Gly Leu Thr Ile Ser Asp Leu Arg Thr Glu Asp 
275 280 285 

Ser Gly Thr Tyr Ile Cys Glu Val Thr Asn Thr Phe Gly Ser Ala Glu 
290 295 300 

Ala Thr Gly Ile Leu Met Val Ile Asp Pro Leu His Val Thr Leu Thr 
305 310 315 320 

Pro Lys Lys Leu Lys Thr Gly Ile Gly Ser Thr Val Ile Leu Ser Cys 
325 330 335 

Ala Leu Thr Gly Ser Pro Glu Phe Thr Ile Arg Trp Tyr Arg Asn Thr 
340 345 350 

Glu Leu Val Leu Pro Asp Glu Ala Ile Ser Ile Arg Gly Leu Ser Asn 
355 360 365 

Glu Thr Leu Leu Ile Thr Ser Ala Gln Lys Ser His Ser Gly Ala Tyr 
370 375 380 

Gln Cys Phe Ala Thr Arg Lys Ala Gln Thr Ala Gln Asp Phe Ala Ile 
385 390 395 400 

Ile Ala Leu Glu Asp Gly Thr Pro Arg Ile Val Ser Ser Phe Ser Glu 
405 410 415 

Lys Val Val Asn Pro Gly Glu Gln Phe Ser Leu Met Cys Ala Ala Lys 
420 425 430 

Gly Ala Pro Pro Pro Thr Val Thr Trp Ala Leu Asp Asp Glu Pro Ile 
435 440 445 

Val Arg Asp Gly Ser His Arg Thr Asn Gln Tyr Thr Met Ser 
450 455 460 


(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 605 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 


Met Lys Thr Pro Leu Leu Val Ser His Leu Leu Leu Ile Ser Leu Thr 
1 5 10 15 

Ser Cys Leu Gly Glu Phe Thr Trp His Arg Arg Tyr Gly His Gly Val 
20 25 30 

Ser Glu Glu Asp Lys Gly Phe Gly Pro Ile Phe Glu Glu Gln Pro Ile 
35 40 45 

Asn Thr Ile Tyr Pro Glu Glu Ser Leu Glu Gly Lys Val Ser Leu Asn 
50 55 60 

Cys Arg Ala Arg Ala Ser Pro Phe Pro Val Tyr Lys Trp Arg Met Asn 
65 70 75 80 

Asn Gly Asp Val Asp Leu Thr Asn Asp Arg Tyr Ser Met Val Gly Gly 
85 90 95 

Asn Leu Val Ile Asn Asn Pro Asp Lys Gln Lys Asp Ala Gly Ile Tyr 
100 105 110 

Tyr Cys Leu Ala Ser Asn Asn Tyr Gly Met Val Arg Ser Thr Glu Ala 
115 120 125 

Thr Leu Ser Phe Gly Tyr Leu Asp Pro Phe Pro Pro Glu Asp Arg Pro 
130 135 140 

Glu Val Lys Val Lys Glu Gly Lys Gly Met Val Leu Leu Cys Asp Pro 
145 150 155 160 

Pro Tyr His Phe Pro Asp Asp Leu Ser Tyr Arg Trp Leu Leu Asn Glu 
165 170 175 

Phe Pro Val Phe Ile Thr Met Asp Lys Arg Arg Phe Val Ser Gln Thr 
180 185 190 

Asn Gly Asn Leu Tyr Ile Ala Asn Val Glu Ser Ser Asp Arg Gly Asn 
195 200 205 

Tyr Ser Cys Phe Val Ser Ser Pro Ser Ile Thr Lys Ser Val Phe Ser 
210 215 220 

Lys Phe Ile Pro Leu Ile Pro Ile Pro Glu Arg Thr Thr Lys Pro Tyr 
225 230 235 240 

Pro Ala Asp Ile Val Val Gln Phe Lys Asp Ile Tyr Thr Met Met Gly 
245 250 255 

Gln Asn Val Thr Leu Glu Cys Phe Ala Leu Gly Asn Pro Val Pro Asp 
260 265 270 

Ile Arg Trp Arg Lys Val Leu Glu Pro Met Pro Thr Thr Ala Glu Ile 
275 280 285 

Ser Thr Ser Gly Ala Val Leu Lys Ile Phe Asn Ile Gln Leu Glu Asp 
290 295 300 

Glu Gly Leu Tyr Glu Cys Glu Ala Glu Asn Ile Arg Gly Lys Asp Lys 
305 310 315 320 

His Gln Ala Arg Ile Tyr Val Gln Ala Phe Pro Glu Trp Val Glu His 
325 330 335 

Ile Asn Asp Thr Glu Val Asp Ile Gly Ser Asp Leu Tyr Trp Pro Cys 
340 345 350 

Val Ala Thr Gly Lys Pro Ile Pro Thr Ile Arg Trp Leu Lys Asn Gly 
355 360 365 

Tyr Ala Tyr His Lys Gly Glu Leu Arg Leu Tyr Asp Val Thr Phe Glu 
370 375 380 

Asn Ala Gly Met Tyr Gln Cys Ile Ala Glu Asn Ala Tyr Gly Thr Ile 
385 390 395 400 

Tyr Ala Asn Ala Glu Leu Lys Ile Leu Ala Leu Ala Pro Thr Phe Glu 
405 410 415 

Met Asn Pro Met Lys Lys Lys Ile Leu Ala Ala Lys Gly Gly Arg Val 
420 425 430 

Ile Ile Glu Cys Lys Pro Lys Ala Ala Pro Lys Pro Lys Phe Ser Trp 
435 440 445 

Ser Lys Gly Thr Glu Trp Leu Val Asn Ser Ser Arg Ile Leu Ile Trp 
450 455 460 

Glu Asp Gly Ser Leu Glu Ile Asn Asn Ile Thr Arg Asn Asp Gly Gly 
465 470 475 480 

Ile Tyr Thr Cys Phe Ala Glu Asn Asn Arg Gly Lys Ala Asn Ser Thr 
485 490 495 

Gly Thr Leu Val Ile Thr Asn Pro Thr Arg Ile Ile Leu Ala Pro Ile 
500 505 510 

Asn Ala Asp Ile Thr Val Gly Glu Asn Ala Thr Met Gln Cys Ala Ala 
515 520 525 

Ser Phe Asp Pro Ser Leu Asp Leu Thr Phe Val Trp Ser Phe Asn Gly 
530 535 540 

Tyr Val Ile Asp Phe Asn Lys Glu Ile Thr Asn Ile His Tyr Gln Arg 
545 550 555 560 

Asn Phe Met Leu Asp Ala Asn Gly Glu Leu Leu Ile Arg Asn Ala Gln 
565 570 575 

Leu Lys His Ala Gly Arg Tyr Thr Cys Thr Ala Gln Thr Ile Val Asp 
580 585 590 

Asn Ser Ser Ala Ser Ala Asp Leu Val Val Arg Gly Pro 
595 600 605 





(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 615 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Trp Arg Gln Ser Thr Ile Leu Ala Ala Leu Leu Val Ala Leu Leu 
1 5 10 15 

Cys Ala Gly Ser Ala Glu Ser Lys Gly Asn Arg Pro Pro Arg Ile Thr 
20 25 30 

Lys Gln Pro Ala Pro Gly Glu Leu Leu Phe Lys Val Ala Gln Gln Asn 
35 40 45 

Lys Glu Ser Asp Pro Glu Arg Asn Pro Phe Ile Ile Glu Cys Glu Ala 
50 55 60 

Asp Gly Gln Pro Glu Pro Glu Tyr Ser Trp Ile Lys Asn Gly Lys Lys 
65 70 75 80 

Phe Asp Trp Gln Ala Tyr Asp Asn Arg Met Leu Arg Gln Pro Gly Arg 
85 90 95 

Gly Thr Leu Val Ile Thr Ile Pro Lys Asp Glu Asp Arg Gly His Tyr 
100 105 110 

Gln Cys Phe Ala Ser Asn Glu Phe Gly Thr Ala Thr Ser Asn Ser Val 
115 120 125 

Tyr Val Arg Lys Ala Glu Leu Asn Ala Phe Lys Asp Glu Ala Ala Lys 
130 135 140 

Thr Leu Glu Ala Val Glu Gly Glu Pro Phe Met Leu Lys Cys Ala Ala 
145 150 155 160 

Pro Asp Gly Phe Pro Ser Pro Thr Val Asn Trp Met Ile Gln Glu Ser 
165 170 175 

Ile Asp Gly Ser Ile Lys Ser Ile Asn Asn Ser Arg Met Thr Leu Asp 
180 185 190 

Pro Glu Gly Asn Leu Trp Phe Ser Asn Val Thr Arg Glu Asp Ala Ser 
195 200 205 

Ser Asp Phe Tyr Tyr Ala Cys Ser Ala Thr Ser Val Phe Arg Ser Glu 
210 215 220 

Tyr Lys Ile Gly Asn Lys Val Leu Leu Asp Val Lys Gln Met Gly Val 
225 230 235 240 

Ser Ala Ser Gln Asn Lys His Pro Pro Val Arg Gln Tyr Val Ser Arg 
245 250 255 

Arg Gln Ser Ala Leu Arg Gly Lys Arg Met Glu Leu Phe Cys Ile Tyr 
260 265 270 

Gly Gly Thr Pro Leu Pro Gln Thr Val Trp Ser Lys Asp Gly Gln Arg 
275 280 285 

Ile Gln Trp Ser Asp Arg Ile Thr Gln Gly His Tyr Gly Lys Ser Leu 
290 295 300 

Val Ile Arg Gln Thr Asn Phe Asp Asp Ala Gly Thr Tyr Thr Cys Asp 
305 310 315 320 

Val Ser Asn Gly Val Gly Asn Ala Gln Ser Phe Ser Ile Ile Leu Asn 
325 330 335 

Val Asn Ser Val Pro Tyr Phe Thr Lys Glu Pro Glu Ile Ala Thr Ala 
340 345 350 

Ala Glu Asp Glu Glu Val Val Phe Glu Cys Arg Ala Ala Gly Val Pro 
355 360 365 

Glu Pro Lys Ile Ser Trp Ile His Asn Gly Lys Pro Ile Glu Gln Ser 
370 375 380 

Thr Pro Asn Pro Arg Arg Thr Val Thr Asp Asn Thr Ile Arg Ile Ile 
385 390 395 400 

Asn Leu Val Lys Gly Asp Thr Gly Asn Tyr Gly Cys Asn Ala Thr Asn 
405 410 415 

Ser Leu Gly Tyr Val Tyr Lys Asp Val Tyr Leu Asn Val Gln Ala Glu 
420 425 430 

Pro Pro Thr Ile Ser Glu Ala Pro Ala Ala Val Ser Thr Val Asp Gly 
435 440 445 

Arg Asn Val Thr Ile Lys Cys Arg Val Asn Gly Ser Pro Lys Pro Leu 
450 455 460 

Val Lys Trp Leu Arg Ala Ser Asn Trp Leu Thr Gly Gly Arg Tyr Asn 
465 470 475 480 

Val Gln Ala Asn Gly Asp Leu Glu Ile Gln Asp Val Thr Phe Ser Asp 
485 490 495 

Ala Gly Lys Tyr Thr Cys Tyr Ala Gln Asn Lys Phe Gly Glu Ile Gln 
500 505 510 

Ala Asp Gly Ser Leu Val Val Lys Glu His Thr Ile Thr Gln Glu Pro 
515 520 525 

Gln Asn Tyr Glu Val Ala Ala Gly Gln Ser Ala Thr Phe Arg Cys Asn
530 535 540

Glu Ala His Asp Asp Thr Leu Glu Ile Glu Ile Asp Trp Trp Lys Asp
545 550 555 560

Gly Gln Ser Ile Asp Phe Glu Ala Gln Pro Arg Phe Val Lys Thr Asn
565 570 575

Asp Asn Ser Leu Thr Ile Ala Lys Thr Met Glu Leu Asp Ser Gly Glu
580 585 590

Tyr Thr Cys Val Ala Arg Thr Arg Leu Asp Glu Ala Thr Ala Arg Ala
595 600 605

Asn Leu Ile Val Gln Asp Val
610 615




(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 611 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:


Met Val Val Ala Leu Arg Tyr Val Trp Pro Leu Leu Leu Cys Ser Pro
1 5 10 15

Cys Leu Leu Ile Gln Ile Pro Glu Glu Tyr Glu Gly His His Val Met
20 25 30

Glu Pro Pro Val Ile Thr Glu Gln Ser Pro Arg Arg Leu Val Val Phe
35 40 45

Pro Thr Asp Asp Ile Ser Leu Lys Cys Glu Ala Ser Gly Lys Pro Glu
50 55 60

Val Gln Phe Arg Trp Thr Arg Asp Gly Val His Phe Lys Pro Lys Glu
65 70 75 80

Glu Leu Gly Val Thr Val Tyr Gln Ser Pro His Ser Gly Ser Phe Thr
85 90 95

Ile Thr Gly Asn Asn Ser Asn Phe Ala Gln Arg Phe Gln Gly Ile Tyr
100 105 110

Arg Cys Phe Ala Ser Asn Lys Leu Gly Thr Ala Met Ser His Glu Ile
115 120 125

Arg Leu Met Ala Glu Gly Ala Pro Lys Trp Pro Lys Glu Thr Val Lys
130 135 140

Pro Val Glu Val Glu Glu Gly Glu Ser Val Val Leu Pro Cys Asn Pro
145 150 155 160

Pro Pro Ser Ala Glu Pro Leu Arg Ile Tyr Trp Met Asn Ser Lys Ile
165 170 175

Leu His Ile Lys Gln Asp Glu Arg Val Thr Met Gly Gln Asn Gly Asn
180 185 190

Leu Tyr Phe Ala Asn Val Leu Thr Ser Asp Asn His Ser Asp Tyr Ile
195 200 205

Cys His Ala His Phe Pro Gly Thr Arg Thr Ile Ile Gln Lys Glu Pro
210 215 220

Ile Asp Leu Arg Val Lys Ala Thr Asn Ser Met Ile Asp Arg Lys Pro
225 230 235 240

Arg Leu Leu Phe Pro Thr Asn Ser Ser Ser His Leu Val Ala Leu Gln
245 250 255

Gly Gln Pro Leu Val Leu Glu Cys Ile Ala Glu Gly Phe Pro Thr Pro
260 265 270

Thr Ile Lys Trp Leu Arg Pro Ser Gly Pro Met Pro Ala Asp Arg Val
275 280 285

Thr Tyr Gln Asn His Asn Lys Thr Leu Gln Leu Leu Lys Val Gly Glu
290 295 300

Glu Asp Asp Gly Glu Tyr Arg Cys Leu Ala Glu Asn Ser Leu Gly Ser
305 310 315 320

Ala Arg His Ala Tyr Tyr Val Thr Val Glu Ala Ala Lys Tyr Arg Ile
325 330 335

Gln Arg Gly Ala Leu Ile Leu Ser Asn Val Gln Pro Ser Asp Thr Met
340 345 350
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val 

Tnr 

Gin 

Cys 

Glu 

Ala 

Arg 

Asn 

Arg 

His 

Gly 

Leu 

Leu 

Leu 

Ala 

Asn 



355 




360 





365 




Ala 

Tyr 

He 

Tyr 

Val 

Val 

Gin 

Leu 

Pro 

Ala 

Lys 

J. xe 

Leu 

1. nr 


Asp 


370 




37 5 





380 





Asn 

Gin 

Thr 

Tyr 

Met 

Ala 

Val 

Pro 

Tyr 

Trp 

Leu 

His 

Lys 

Pro 

vj±n 

oer 

385 




390 





395 





4UU 

His 

Leu 

Tyr 

Gly 

Pro 

Gly 

Glu 

Thr 

Ala 

Arg 

Leu 

Asp 


m n 

U XII 

V a. X 

VsXIl 



405 





410 





** Xb 


Gly 

Arg 

Pro 

Gin 

Pro 

Glu 

Val 

Thr 

Trp 

Arg 

He 

Asn 

ij±y 

X xe 

rTO 

vax 


420 





425 





^ o r\ 



Glu 

Glu 

Leu 

Ala 

Lys 

Asp 

Gin 

Gin 

Gly 

Ser 

Thr 

Ala 

Tyr 

Leu 

Leu 

Cys 



435 



440 





445 




Lys 

Ala 

Phe 

Gly 

Ala 

Pro 

Val 

Pro 

Ser 

Val 

Gin 

Trp 

Leu 

Asp 

(jXU 

Asp 

450 




455 





460 


r\.Xa. 



Gly 

Thr 

Thr 

Val 

Leu 

Gin 

Asp 

Glu 

Arg 

Phe 

Phe 

Pro 

Tyr 

Asn 

tjxy 

465 





470 





475 





480 

Thr 

Leu 

Gly 

He 

Arg 

Asp 

Leu 

Gin 

Ala 

Asn 

Asp 

rn V- 

Tnr 


Arg 

Tyr 

fne 




485 





490 





4y b 


Cys 

Leu 

Ala 

Ala 

Asn 

Asp 

Gin 

Asn 

Asn 

Val 

Thr 

I le 

Met 

Ala 

Asn 

Leu 



500 





505 





510 



Lys 

val 

Lys 

Asp 

Ala 

Thr 

Gin 

He 

Thr 

Gin 

Gly 

Pro 

Arg 

Ser 

X nr 

X xe 


515 




520 





525 




Glu 

Lys 

Lys 

Gly 

Ser 

Arg 

Val 

Thr 

Phe 

Thr 

Cys 

Gin 

Ala 

Ser 

r^ne 

Asp 


530 



535 





540 





Pro 

Ser 

Leu 

Gin 

Pro 

Ser 

He 

Thr 

Trp 

Arg 

Gly 

Asp 

Gly Arg 

Asp 

Leu 

545 





550 




c c c 






Gin 

Glu 

Leu 

Gly 

Asp 

Ser 

Asp 

Lys 

Tyr 

Phe 

He 

Glu 

Asp 

Gly 

Arg 

Leu 




565 





570 





575 


Val. He 

His 

Ser 

Leu 

Asp 

Tyr 

Ser 

Asp 

Gin 

Gly 

Asn 

Tyr 

Ser 

Cys 

Val 




580 



585 





590 



Ala 

Ser 

Thr 

Glu 

Leu 

Asp 

Val 

Val 

Glu 

Ser 

Arg 

Ala 

Gin 

Leu 

Leu 

Val 



595 




600 





605 





Val Gly Ser 
610 


(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 612 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 


Met Met Lys Glu Lys Ser Ile Ser Ala Ser Lys Ala Ser Leu Val Phe
1 5 10 15

Phe Leu Cys Gln Met Ile Ser Ala Leu Asp Val Pro Leu Asp Ser Lys
20 25 30

Leu Leu Glu Glu Leu Ser Gln Pro Pro Thr Ile Thr Gln Gln Ser Pro
35 40 45

Lys Asp Tyr Ile Val Asp Pro Arg Glu Asn Ile Val Ile Gln Cys Glu
50 55 60

Ala Lys Gly Lys Pro Pro Pro Ser Phe Ser Trp Thr Arg Asn Gly Thr
65 70 75 80

His Phe Asp Ile Asp Lys Asp Ala Gln Val Thr Met Lys Pro Asn Ser
85 90 95

Gly Thr Leu Val Val Asn Ile Met Asn Gly Val Lys Ala Glu Ala Tyr
100 105 110

Glu Gly Val Tyr Gln Cys Thr Ala Arg Asn Glu Arg Gly Ala Ala Ile
115 120 125

Ser Asn Asn Ile Val Ile Arg Pro Ser Arg Ser Pro Leu Trp Thr Lys
130 135 140





Glu Lys Leu Glu Pro Asn His Val Arg Glu Gly Asp Ser Leu Val Leu
145 150 155 160

Asn Cys Arg Pro Pro Val Gly Leu Pro Pro Pro Ile Ile Phe Trp Met
165 170 175

Asp Asn Ala Phe Gln Arg Leu Pro Gln Ser Glu Arg Val Ser Gln Gly
180 185 190

Leu Asn Gly Asp Leu Tyr Phe Ser Asn Val Gln Pro Glu Asp Thr Arg
195 200 205

Val Asp Tyr Ile Cys Tyr Ala Arg Phe Asn His Thr Gln Thr Ile Gln
210 215 220

Gln Lys Gln Pro Ile Ser Val Lys Val Phe Ser Thr Lys Pro Val Thr
225 230 235 240

Glu Arg Pro Pro Val Leu Leu Thr Pro Met Gly Ser Thr Ser Asn Lys
245 250 255

Val Glu Leu Arg Gly Asn Val Leu Leu Leu Glu Cys Ile Ala Ala Gly
260 265 270

Leu Pro Thr Pro Val Ile Arg Trp Ile Lys Glu Gly Gly Glu Leu Pro
275 280 285

Ala Asn Arg Thr Phe Phe Glu Asn Phe Lys Lys Thr Leu Lys Ile Ile
290 295 300

Asp Val Ser Glu Ala Asp Ser Gly Asn Tyr Lys Cys Thr Ala Arg Asn
305 310 315 320

Thr Leu Gly Ser Thr His His Val Ile Ser Val Thr Val Lys Ala Ala
325 330 335

Pro Tyr Trp Ile Thr Ala Pro Arg Asn Leu Val Leu Ser Pro Gly Glu
340 345 350

Asp Gly Thr Leu Ile Cys Arg Ala Asn Gly Asn Pro Lys Pro Ser Ile
355 360 365

Ser Trp Leu Thr Asn Gly Val Pro Ile Ala Ile Ala Pro Glu Asp Pro
370 375 380

Ser Arg Lys Val Asp Gly Asp Thr Ile Ile Phe Ser Ala Val Gln Glu
385 390 395 400

Arg Ser Ser Ala Val Tyr Gln Cys Asn Ala Ser Asn Glu Tyr Gly Tyr
405 410 415

Leu Leu Ala Asn Ala Phe Val Asn Val Leu Ala Glu Pro Pro Arg Ile
420 425 430

Leu Thr Pro Ala Asn Lys Leu Tyr Gln Val Ile Ala Asp Ser Pro Ala
435 440 445

Leu Ile Asp Cys Ala Tyr Phe Gly Ser Pro Lys Pro Glu Ile Glu Trp
450 455 460

Phe Arg Gly Val Lys Gly Ser Ile Leu Arg Gly Asn Glu Tyr Val Phe
465 470 475 480

His Asp Asn Gly Thr Leu Glu Ile Pro Val Ala Gln Lys Asp Ser Thr
485 490 495

Gly Thr Tyr Thr Cys Val Ala Arg Asn Lys Leu Gly Lys Thr Gln Asn
500 505 510

Glu Val Gln Leu Glu Val Lys Asp Pro Thr Met Ile Ile Lys Gln Pro
515 520 525

Gln Tyr Lys Val Ile Gln Arg Ser Ala Gln Ala Ser Phe Glu Cys Val
530 535 540

Ile Lys His Asp Pro Thr Leu Ile Pro Thr Val Ile Trp Leu Lys Asp
545 550 555 560

Asn Asn Glu Leu Pro Asp Asp Glu Arg Phe Leu Val Gly Lys Asp Asn
565 570 575

Leu Thr Ile Met Asn Val Thr Asp Lys Asp Asp Gly Thr Tyr Thr Cys
580 585 590

Ile Val Asn Thr Thr Leu Asp Ser Val Ser Ala Ser Ala Val Leu Thr
595 600 605

Val Val Ala Ala
610


(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 607 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gly Thr Ala Thr Arg Arg Lys Pro His Leu Leu Leu Val Ala Ala
1 5 10 15

Val Ala Leu Val Ser Ser Ser Ala Trp Ser Ser Ala Leu Gly Ser Gln
20 25 30

Thr Thr Phe Gly Pro Val Phe Glu Asp Gln Pro Leu Ser Val Leu Phe
35 40 45

Pro Glu Glu Ser Thr Glu Glu Gln Val Leu Leu Ala Cys Arg Ala Arg
50 55 60

Ala Ser Pro Pro Ala Thr Tyr Arg Trp Lys Met Asn Gly Thr Glu Met
65 70 75 80

Lys Leu Glu Pro Gly Ser Arg His Gln Leu Val Gly Gly Asn Leu Val
85 90 95

Ile Met Asn Pro Thr Lys Ala Gln Asp Ala Gly Val Tyr Gln Cys Leu
100 105 110

Ala Ser Asn Pro Val Gly Thr Val Val Ser Arg Glu Ala Ile Leu Arg
115 120 125

Phe Gly Phe Leu Gln Glu Phe Ser Lys Glu Glu Arg Asp Pro Val Lys
130 135 140

Ala His Glu Gly Trp Gly Val Met Leu Pro Cys Asn Pro Pro Ala His
145 150 155 160

Tyr Pro Gly Leu Ser Tyr Arg Trp Leu Leu Asn Glu Phe Pro Asn Phe
165 170 175

Ile Pro Thr Asp Gly Arg His Phe Val Ser Gln Thr Thr Gly Asn Leu
180 185 190

Tyr Ile Ala Arg Thr Asn Ala Ser Asp Leu Gly Asn Tyr Ser Cys Leu
195 200 205

Ala Thr Ser His Met Asp Phe Ser Thr Lys Ser Val Phe Ser Lys Phe
210 215 220

Ala Gln Leu Asn Leu Ala Ala Glu Asp Thr Arg Leu Phe Ala Pro Ser
225 230 235 240

Ile Lys Ala Arg Phe Pro Ala Glu Thr Tyr Ala Leu Val Gly Gln Gln
245 250 255

Val Thr Leu Glu Cys Phe Ala Phe Gly Asn Pro Val Pro Arg Ile Lys
260 265 270

Trp Arg Lys Val Asp Gly Ser Leu Ser Pro Gln Trp Thr Thr Ala Glu
275 280 285

Pro Thr Leu Gln Ile Pro Ser Val Ser Phe Glu Asp Glu Gly Thr Tyr
290 295 300

Glu Cys Glu Ala Glu Asn Ser Lys Gly Arg Asp Thr Val Gln Gly Arg
305 310 315 320

Ile Ile Val Gln Ala Gln Pro Glu Trp Leu Lys Val Ile Ser Asp Thr
325 330 335

Glu Ala Asp Ile Gly Ser Asn Leu Arg Trp Gly Cys Ala Ala Ala Gly
340 345 350

Lys Pro Arg Pro Thr Val Arg Trp Leu Arg Asn Gly Glu Pro Leu Ala
355 360 365

Ser Gln Asn Arg Val Glu Val Leu Ala Gly Asp Leu Arg Phe Ser Lys
370 375 380

Leu Ser Leu Glu Asp Ser Gly Met Tyr Gln Cys Val Ala Glu Asn Lys
385 390 395 400

His Gly Thr Ile Tyr Ala Ser Ala Glu Leu Ala Val Gln Ala Leu Ala
405 410 415

Pro Asp Phe Arg Leu Asn Pro Val Arg Arg Leu Ile Pro Ala Ala Arg
420 425 430

Gly Gly Glu Ile Leu Ile Pro Cys Gln Pro Arg Ala Ala Pro Lys Ala
435 440 445

Val Val Leu Trp Ser Lys Gly Thr Glu Ile Leu Val Asn Ser Ser Arg
450 455 460

Val Thr Val Thr Pro Asp Gly Thr Leu Ile Ile Arg Asn Ile Ser Arg
465 470 475 480

Ser Asp Glu Gly Lys Tyr Thr Cys Phe Ala Glu Asn Phe Met Gly Lys
485 490 495

Ala Asn Ser Thr Gly Ile Leu Ser Val Arg Asp Ala Thr Lys Ile Thr
500 505 510

Leu Ala Pro Ser Ser Ala Asp Ile Asn Leu Gly Asp Asn Leu Thr Leu
515 520 525

Gln Cys His Ala Ser His Asp Pro Thr Met Asp Leu Thr Phe Thr Trp
530 535 540

Thr Leu Asp Asp Phe Pro Ile Asp Phe Asp Lys Pro Gly Gly His Tyr
545 550 555 560

Arg Arg Thr Asn Val Lys Glu Thr Ile Gly Asp Leu Thr Ile Leu Asn
565 570 575

Ala Gln Leu Arg His Gly Gly Lys Tyr Thr Cys Met Ala Gln Thr Val
580 585 590

Val Asp Ser Ala Ser Lys Glu Ala Thr Val Leu Val Arg Gly Pro
595 600 605





(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 596 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 


Met Leu Ser Trp Lys Gln Leu Ile Leu Leu Ser Phe Ile Gly Cys Leu
1 5 10 15

Ala Gly Glu Leu Leu Leu Gln Gly Pro Val Phe Val Lys Glu Pro Ser
20 25 30

Asn Ser Ile Phe Pro Val Gly Ser Glu Asp Lys Lys Ile Thr Leu Asn
35 40 45

Cys Glu Ala Arg Gly Asn Pro Ser Pro His Tyr Arg Trp Gln Leu Asn
50 55 60

Gly Ser Asp Ile Asp Thr Ser Leu Asp His Arg Tyr Lys Leu Asn Gly
65 70 75 80

Gly Asn Leu Ile Val Ile Asn Pro Asn Arg Asn Trp Asp Thr Gly Ser
85 90 95

Tyr Gln Cys Phe Ala Thr Asn Ser Leu Gly Thr Ile Val Ser Arg Glu
100 105 110

Ala Lys Leu Gln Phe Ala Tyr Leu Glu Asn Phe Lys Ser Arg Met Arg
115 120 125

Ser Arg Val Ser Val Arg Glu Gly Gln Gly Val Val Leu Leu Cys Gly
130 135 140

Pro Pro Pro His Ser Gly Glu Leu Ser Tyr Ala Trp Val Phe Asn Glu
145 150 155 160

Tyr Pro Ser Phe Val Glu Glu Asp Ser Arg Arg Phe Val Ser Gln Glu
165 170 175

Thr Gly His Leu Tyr Ile Ala Lys Val Glu Pro Ser Asp Val Gly Asn
180 185 190

Tyr Thr Cys Val Val Thr Ser Thr Val Thr Asn Ala Arg Val Leu Gly
195 200 205

Ser Pro Thr Pro Leu Val Leu Arg Ser Asp Gly Val Met Gly Glu Tyr
210 215 220

Glu Pro Lys Ile Glu Leu Gln Phe Pro Glu Thr Leu Pro Ala Ala Lys
225 230 235 240

Gly Ser Thr Val Lys Leu Glu Cys Phe Ala Leu Gly Asn Pro Val Pro
245 250 255

Gln Ile Asn Trp Arg Arg Ser Asp Gly Met Pro Phe Pro Thr Lys Ile
260 265 270

Lys Leu Arg Lys Phe Asn Gly Val Leu Glu Ile Pro Asn Phe Gln Gln
275 280 285

Glu Asp Thr Gly Ser Tyr Glu Cys Ile Ala Glu Asn Ser Arg Gly Lys
290 295 300

Asn Val Ala Arg Gly Arg Leu Thr Tyr Tyr Ala Lys Pro Tyr Trp Val
305 310 315 320

Gln Leu Leu Lys Asp Val Glu Thr Ala Val Glu Asp Ser Leu Tyr Trp
325 330 335

Glu Cys Arg Ala Ser Gly Lys Pro Lys Pro Ser Tyr Arg Trp Leu Lys
340 345 350

Asn Gly Asp Ala Leu Val Leu Glu Glu Arg Ile Gln Ile Glu Asn Gly
355 360 365

Ala Leu Thr Ile Ala Asn Leu Asn Val Ser Asp Ser Gly Met Phe Gln
370 375 380

Cys Ile Ala Glu Asn Lys His Gly Leu Ile Tyr Ser Ser Ala Glu Leu
385 390 395 400

Lys Val Leu Ala Ser Ala Pro Asp Phe Ser Arg Asn Pro Met Lys Lys
405 410 415

Met Ile Gln Val Gln Val Gly Ser Leu Val Ile Leu Asp Cys Lys Pro
420 425 430

Ser Ala Ser Pro Arg Ala Leu Ser Phe Trp Lys Lys Gly Asp Thr Val
435 440 445

Val Arg Glu Gln Ala Arg Ile Ser Leu Leu Asn Asp Gly Gly Leu Lys
450 455 460

Ile Met Asn Val Thr Lys Ala Asp Ala Gly Ile Tyr Thr Cys Ile Ala
465 470 475 480

Glu Asn Gln Phe Gly Lys Ala Asn Gly Thr Thr Gln Leu Val Val Thr
485 490 495

Glu Pro Thr Arg Ile Ile Leu Ala Pro Ser Asn Met Asp Val Ala Val
500 505 510

Gly Glu Ser Ile Ile Leu Pro Cys Gln Val Gln His Asp Pro Leu Leu
515 520 525

Asp Ile Met Phe Ala Trp Tyr Phe Asn Gly Thr Leu Thr Asp Phe Lys
530 535 540

Gly Lys Asp Gly Ser His Phe Glu Lys Val Gly Gly Ser Ser Ser Asp
545 550 555 560

Leu Met Ile Arg Asn Ile Gln Leu Lys His Ser Gly Lys Tyr Val Cys
565 570 575

Met Val Gln Thr Gly Val Asp Ser Val Ser Ser Ala Ala Glu Leu Ile
580 585 590

Val Arg Gly Ser
595


(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 


Met Val Leu His Ser His Gln Leu Thr Tyr Ala Gly Ile Ala Phe Ala
1 5 10 15

Leu Cys Leu His His Leu Ile Ser Ala Ile Glu Val Pro Leu Asp Ser
20 25 30

Asn Ile Gln Ser Glu Leu Pro Gln Pro Pro Thr Ile Thr Lys Gln Ser
35 40 45

Val Lys Asp Tyr Ile Val Asp Pro Arg Asp Asn Ile Phe Ile Glu Cys
50 55 60

Gly Glu Ala Lys Gly Asn Pro Val Pro Thr Phe Ser Trp Thr Arg Asn
65 70 75 80

Lys Phe Phe Asn Val Ala Lys Asp Pro Lys Val Ser Met Arg Arg Arg
85 90 95

Ser Gly Thr Leu Val Ile Asp Phe His Gly Gly Gly Arg Pro Asp Asp
100 105 110

Tyr Glu Gly Glu Tyr Gln Cys Phe Ala Arg Asn Asp Tyr Gly Thr Ala
115 120 125

Leu Ser Ser Lys Ile His Leu Gln Val Ser Arg Ser Pro Leu Trp Pro
130 135 140

Lys Glu Lys Val Asp Val Ile Glu Val Asp Glu Gly Ala Pro Leu Ser
145 150 155 160

Leu Gln Cys Asn Pro Pro Pro Gly Leu Pro Pro Pro Val Ile Phe Trp
165 170 175

Met Ser Ser Ser Met Glu Pro Ile His Gln Asp Lys Arg Val Ser Gln
180 185 190

Gly Gln Asn Gly Asp Leu Tyr Phe Ser Asn Val Met Leu Gln Asp Ala
195 200 205

Gln Thr Asp Tyr Ser Cys Asn Ala Arg Phe His Phe Thr His Thr Ile
210 215 220

Gln Gln Lys Asn Pro Tyr Thr Leu Lys Val Lys Thr Lys Lys Pro His
225 230 235 240

Asn Glu Thr Ser Leu Arg Asn His Thr Asp Met Tyr Ser Ala Arg Gly
245 250 255

Val Thr Glu Thr Thr Pro Ser Phe Met Tyr Pro Tyr Gly Thr Ser Ser
260 265 270

Ser Gln Met Val Leu Arg Gly Val Asp Leu Leu Leu Glu Cys Ile Ala
275 280 285

Ser Gly Val Pro Ala Pro Asp Ile Met Trp Tyr Lys Lys Gly Gly Glu
290 295 300

Leu Pro Ala Gly Lys Thr Lys Leu Glu Asn Phe Asn Lys Ala Leu Arg
305 310 315 320

Ile Ser Asn Val Ser Glu Glu Asp Ser Gly Glu Tyr Phe Cys Leu Ala
325 330 335

Ser Asn Lys Met Gly Ser Ile Arg His Thr Ile Ser Val Arg Val Lys
340 345 350

Ala Ala Pro Tyr Trp Leu Asp Glu Pro Gln Asn Leu Ile Leu Ala Pro
355 360 365

Gly Glu Asp Gly Arg Leu Val Cys Arg Ala Asn Gly Asn Pro Lys Pro
370 375 380

Ser Ile Gln Trp Leu Val Asn Gly Glu Pro Ile Glu Gly Ser Pro Pro
385 390 395 400

Asn Pro Ser Arg Glu Val Ala Gly Asp Thr Ile Val Phe Arg Asp Thr
405 410 415

Gln Ile Gly Ser Ser Ala Val Tyr Gln Cys Asn Ala Ser Asn Glu His
420 425 430

Gly Tyr Leu Leu Ala Asn Ala Phe Val Ser Val Leu Asp Val Pro Pro
435 440 445

Arg Ile Leu Ala Pro Arg Asn Gln Leu Ile Lys Val Ile Gln Tyr Asn
450 455 460

Arg Thr Arg Leu Asp Cys Pro Phe Phe Gly Ser Pro Ile Pro Thr Leu
465 470 475 480

Arg Trp Phe Lys Asn Gly Gln Gly Asn Met Leu Asp Gly Gly Asn Tyr
485 490 495

Lys Ala His Glu Asn Gly Ser Leu Glu Met Ser Met Ala Arg Lys Glu
500 505 510

Asp Gln Gly Ile Tyr Thr Cys Val Ala Thr Asn Ile Leu Gly Lys Val
515 520 525

Glu Ala Gln Val Arg Leu Glu Val Lys Asp Pro Thr Arg Ile Val Arg
530 535 540

Gly Pro Glu Asp Gln Val Val Lys Arg Gly Ser Met Pro Arg Leu His
545 550 555 560

Cys Arg Val Lys His Asp Pro Thr Leu Lys Leu Thr Val Thr Trp Leu
565 570 575

Lys Asp Asp Ala Pro Leu Tyr Ile Gly Asn Arg Met Lys Lys Glu Asp
580 585 590

Asp Gly Leu Thr Ile Tyr Gly Val Ala Glu Lys Asp Gln Gly Asp Tyr
595 600 605

Thr Cys Val Ala Ser Thr Glu Leu Asp Lys Asp Ser Ala Lys Ala Tyr
610 615 620

Leu Thr Val Leu Ala Ile
625 630
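
For illustration only, a listing in the three-letter format used above can be checked mechanically against its declared LENGTH. The following Python sketch is not part of the disclosed method; the function names (`parse_listing`, `to_one_letter`, `check_length`) are hypothetical, and the sketch assumes that only the twenty standard residue codes occur and that position markers are bare integers.

```python
# Illustrative sketch only; names are hypothetical and not from the specification.
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_listing(text):
    """Return the residue tokens of a listing, skipping position numbers."""
    residues = []
    for token in text.split():
        if token.isdigit():
            continue  # position markers such as 5, 10, 15
        if token not in THREE_TO_ONE:
            # Damaged tokens (for example a misread residue name) are rejected
            # rather than silently guessed.
            raise ValueError(f"unrecognized residue token: {token!r}")
        residues.append(token)
    return residues


def to_one_letter(residues):
    """Convert a list of three-letter residue codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


def check_length(residues, declared_length):
    """Compare the residue count with the LENGTH stated in the listing header."""
    return len(residues) == declared_length


if __name__ == "__main__":
    # A single six-residue row with its position markers, used as sample input.
    row = "Leu Thr Val Leu Ala Ile\n625 630"
    residues = parse_listing(row)
    print(to_one_letter(residues))      # LTVLAI
    print(check_length(residues, 6))    # True
```

Applied to a complete listing, a failed length check or a `ValueError` points to a row that was damaged in transcription.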
What is claimed is:

1. A method for identifying a cDNA nucleic acid
encoding a mammalian protein having a signal sequence,
the method comprising:

a) providing a library of mammalian cDNA;

b) ligating said library of mammalian cDNA to DNA
encoding alkaline phosphatase lacking both a signal
sequence and a membrane anchor sequence to form ligated
DNA;

c) transforming bacterial cells with said ligated
DNA to create a bacterial cell clone library;

d) isolating DNA comprising said mammalian cDNA
from at least one clone in said bacterial cell clone
library;

e) separately transfecting DNA isolated from
clones in step (d) into mammalian cells which do not
express alkaline phosphatase to create a mammalian cell
clone library wherein each clone in said mammalian cell
clone library corresponds to a clone in said bacterial
cell clone library;

f) identifying a clone in said mammalian cell
clone library which expresses alkaline phosphatase;

g) identifying the clone in said bacterial cell
clone library corresponding to said clone in said
mammalian cell clone library identified in step (f); and

h) isolating and sequencing a portion of the
mammalian cDNA present in said bacterial cell library
clone identified in step (g) to identify a mammalian cDNA
encoding a mammalian protein having a signal sequence.

2. The method of claim 1 wherein said library of
mammalian cDNA is ligated to ptrAP3.

3. The method of claim 1 wherein said mammalian
cells are COS7 cells.

4. The method of claim 1 wherein said bacterial
cells are E. coli.

5. The expression vector ptrAP3.

6. The expression vector of claim 5, comprising
the sequence of SEQ ID NO: 1.

7. The protein of SEQ ID NO: 5.

8. An isolated nucleic acid sequence encoding the
amino acid sequence of SEQ ID NO: 5.

9. A vector comprising the nucleic acid sequence
of claim 8.

10. The vector of claim 9 wherein said vector is
an expression vector.

11. A genetically engineered host cell comprising
the nucleic acid sequence of claim 5.
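
For illustration only, the clone correspondence required by steps (e) through (g) of claim 1 can be pictured as simple record keeping: each separately transfected mammalian culture is tied to the bacterial clone that supplied its DNA, so that a culture scoring positive for alkaline phosphatase leads back to one bacterial clone for sequencing. The Python sketch below is an assumption-laden model, not the claimed method; the `Transfection` record, the `clones_to_sequence` function, the clone and well identifiers, and the numeric activity threshold are all hypothetical.

```python
# Illustrative sketch only; all names, identifiers, and the threshold are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfection:
    bacterial_clone: str  # clone in the bacterial cell clone library (step c)
    well: str             # mammalian culture transfected with that clone's DNA (step e)
    ap_activity: float    # alkaline phosphatase reading, arbitrary units


def clones_to_sequence(transfections, threshold):
    """Steps (f)-(g): find AP-positive cultures and map them back to bacterial clones."""
    # One culture per bacterial clone, so the mapping is one-to-one.
    well_to_clone = {t.well: t.bacterial_clone for t in transfections}
    positive_wells = [t.well for t in transfections if t.ap_activity > threshold]
    return [well_to_clone[w] for w in positive_wells]


if __name__ == "__main__":
    runs = [
        Transfection("bact-001", "well-A1", 0.2),
        Transfection("bact-002", "well-A2", 7.5),
        Transfection("bact-003", "well-A3", 0.1),
    ]
    # Clones returned here would proceed to isolation and sequencing (step h).
    print(clones_to_sequence(runs, threshold=1.0))  # ['bact-002']
```

The one-culture-per-clone layout reflects the "separately transfecting" language of step (e); pooling several clones per culture would need a different mapping.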

FIG. 1
ptrAP3 (figure label: stuffer)
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ptrAP3 vector sequence 


AAGCTTGGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGC 

AAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGC 

AAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCbGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGC 

CCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGG 

CCTCTOAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTCCTCC^^ 

CGAGGGGCTCGCATCTCTCCTTCACGCGCCCGCCGCCCTACCTGAGGCCGCCATCCACGCCGGTTGAGTCGC 

GTTCTGCCGCCTCCCGCCTGTGGTGCCTCCTGAACTGCGTCCGCCGTCTAGGTAAGTTTAAAGCTCAGGTCG 

AGACCGGGCCTTTGTCCGGCGCTCCCTTGGAGCCTACCTAGACTCAGCCGGCTCTCCACGCTTTGCCT^ 

CTGCTTGCTCAACTCTACGTCTTTGTTTCGTTTTCTGTTCTGCGCC^ 

AGAAAGTTAACTGGTAAGTTTAGTCTTTTTGTCTTTTATTTCAG^ 

ATCTAAGAACTGCTCCTCAGTGAGTGTTGCCTTTACTTCTAGGCCTGTACGGAAGTGTO 

AAGCTGC GQXATTCQCXCCXCCOTAQTTTTTXCQCCC QqTQXQCaCTCCACCCQCXCCTXeX 
XQCQCaTqTXTqXTqXQOTQTXCqQCQXCQXQqXCCy QCTTOXOCXQOCCXXCOXQCOCCT 

cqQqqxaT'rTqccfrxcqqxxxqcqqcxTXxqQXCXTq TTqqcQTTqccqcTQQXcqxqqqc 
xxgccxxcxccTxqccTxxxqcccqTqxcxcTqcxqcxqqTqcTqcccxcqcTTocxccoy 
gcqxxqxxxxqcqcqqccTXXxQcqcqxq'rgTqqTqxcTTqqcxcccxccqTqcxqcTqxT 
qqTxcccxxqcqccxqcqxcTqqxxqxTqTCTTqqxxxxxxTqxccqTqqxqccTqqqcTq 
' qxqcccqxqqTccqcqTGCGqccxxTcxxqcxqqTqqcxccqqqxcTqqqcqTqcxqxccq 
TqqxcqTTCxqi^TxcccxccxccxqTxqcxcTxqT XTTqccxcTqccxcxqxqqqcxTqqx 

OACACAXACQTgCgCOQTTQCCTAQCTCOAqATCArCCCA'f?TTGAGGAGgAGAACCCgGACrTCTG 

(^ACAAArrOGrinrrTGAGATACCCCTGGCCATGGACCarTrCCrATATGTGGCTCT^ 

TGTAGArAAACATGTGCCAGACAGTGGAnCCACAaCCArr^rCTACCTGTGCGOCXITCAACX^ 

GACCA TTnnCTTGAGTGCAGCCGCCCaCT'rTAACCAGTnrAArA rGACACGCGGCAACGAGaTTA TTTrrnT 

GATaAATCCXX;CCAAaAAAGCAGC^AAaTCAGTG(^,AGTr^,TAAnrACCACArnAGTarArj'^ 

AGCCGnrACCTArnrrCACACGGTGAACrGC'AArTGG'rArTrmACGCCGACGTGCCTGCrTrftr^ 

GaAa(^^TGCCAru7ArATCGrTArGrAr^TrATrTrrJ^j^rATruiArATTGAcaTGATCC^^ 


FIG. 2 
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^r,^^^^ rj,r^r^,rrr.AnArTr r-^r. ^^r^n^,ArrrrTrrrTnA^^^^ 
r.^ ^^,^^rrrrcacc^-r'r^rTTrrTrT T rn'vr^,Ar^^,T^^^ 
^^ ^ r-nr.r:nrArTnArr r -^rzAnr,ATrArr,rrrGArG^CGC^^ 
^^^^^ rArnrTnAnrCTC r -'rnAr'rnrrnArCArrrrrACG^^^^ 

^^r-^r-.nrTATGTGrTr-^^nnArGGc q rrrGGrrGGATGrTArrGAGAC^^^^ 
r-.r.^nrAnTrAGCAG'rr.rrrrTGGAC G ^'^nAnArrrACGrAGGCGAGGfiCnTr^CG^^^ 

r-r-^ ^.nrnrArrTGGTT r- ^rnnrn'TGrAGGAGrAGArrTrCAT^^^^^ 
r.r-^^ r.r-nn^.rArrGrr T --nn.Anr'rr^,rGrrrrrrGCr^^^^^ 

TCTAGAGAAAAAACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAATTGTTGTTGTTAACT 

TGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTT 

CACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGGATCCCCGGGTACCGAG 

CTCGAATTAATTCCTCTTCCGCTTCCtCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGG 

TATCAGCTCAGTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAG 

CAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCC 

CTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGG 

CGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCT 

TTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTC 

GCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTC 

TTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGA 

GGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTG 

GTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCA 

CCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATC 

CTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGAT. 

TATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATG 

AGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTT 

CATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTG 

CTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG 

CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAG 

TAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGT 

CGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCA 

AAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGG 
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TTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT 

CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATA 

CCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCA^ 

TCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTT 

TCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAG 

AATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCG 

GATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCAC 

CTGC Csetz. Ajc^:^ 

1^ 
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FIG. 3 

MT.LL^LLLLGLRLOLSLG IIPVEEENPDFWNREAAEALGAAKKLQPAQTAAKNLI 
I FLGDGMGVS TVT AARI LKGQKKDKLG P E I PLAMDRF P YVAL S KT YNVDKHVPD 
SGATATAYLCGVKGNFQTIGLSAAARFNQCNTTRGNEVISVMNRAKKAGKSVGV 
VTTTRVQHASPAGTYAHTVNRNWYSDADVPASARQEGCQDIATQLISNMDIDVI 
LGGGRKYMFRMGTPDPEYPDDYSQGGTRLDGKNLVQEWLAKRQGARYVWNRTEL 
MQASLDPSVTHLMGLFEPGDMKYEIHRDSTLDPSLMEMTEAALRLLSRNPRGFF 
LFVEGGRIDHGHHESRAYRALTETIMFDDAIERAGQLTSEEDTLSLVTADHSHV 
FSFGGYPLRGSSIFGLAPGKARDRKAYTVLLYGNGPGYVLKDGARPDVTESESG 
SPEYRQQSAVPLDEETHAGEDVAVFARGPQAHLVHGVQEQTFIAHVMAFAACLE 
P YTACDLAP PAGTTnAAHPG RSWPAIiLPLLAGTLI.LriETATAP 

FIG. 4

1 1 PVEEENPDFWNREAAEALGAAKKLQPAQTAAKNL I IFLGDGMGVSTVTAARI 
LKGQKKDKLGPEIPLAMDRFPYVALSKTYNVDKHVPDSGATATAYLCGVKGNFQ 
TIGLSAAARFNQCNTTRGNEVISVMNRAKKAGKSVGVVTTTRVQHASPAGTYAH
TVNKNWYSDADVPASARQEGCQDIATQLISNMDIDVILGGGRKYMFRMGTPDPE 
YPDDYSQGGTRLDGKNLVQEWLAKRQGARYVWNRTELMQASLDPSVTHLMGLFE 
PGDMKYEIHRDSTLDPSLMEMTEAALRLLSRNPRGFFLFVEGGRIDHGHHESRA 
YRALTETIMFDDAIERAGQLTSEEDTLSLVTADHSHVFSFGGYPLRGSSIFGLA 
PGKARDRKAYTVLLYGNGPGYVLKDGARPDVTESESGSPEYRQQSAVPLDEETH 
AGEDVAVFARGPQAHLVHGVQEQTFIAHVMAFAACLEPYTACDLAPPAGTTDAA 
HPG (^SS? iao:i) 
MWLVTFLLLLDSLHK 15

JJATAQCXKICTCTTTATOGC ATG TOG CTC CTTA ACT TTC CTC CTG CTC CTG GAC TCT TTA CAC AAA 143 

ARPEDVGTSLYFVNDSLQQV 35 

GCC CGC CCT GAA GAT GTT GGC ACC AGC CTC TAC TTT GTA AAT GAC TCC TTG CAG CAC GTG 203

TPS s SVCV VVPCPAAGSPSA 55 

ACC TTT TCC AGC TCC GTG GGG GTG GTG GTG CCC TGC CCG GCC GCG GCC TCC CCC AGC GCG 263

ALRWYLATGDDIYDVPHIRH 75

GCC CTT CGA TGC TAC CTG GCC ACA GGG GAC GAC ATC TAC GAC GTG CCG CAC ATC CGG CAC 323

VHANGTLQLYPFSPSAFNSF 95

GTC CAC GCC AAC GGG ACC CTG CAG CTC TAC CCC TTC TCC CCC TCC GCC TTC AAT AGC TTT 383

IHDNDYFCTAENAAGKIRSP 115 

ATC CAC GAC AAT GAC TAC TTC TGC ACC GCG GAG AAC GCT GCC GGC AAG ATC CGG AGC CCC 443

NIRVKAVFREPYTVRVEDQR 135

AAC ATC CGC GTC AAA GCA GTT TTC AGG GAA CCC TAC ACC GTC CGG GTG GAG GAT CAA AGG 503

SMRGNVAVFKCLIPSSVQEY 155

TCA ATG CGT GGC AAC GTG GCC GTC TTC AAG TGC CTC ATC CCC TCT TCA GTG CAG GAA TAT 563 

VSVVSWEKDTVSIIPENRFF 175

GTT AGC G~ OTA TCT TGG GAG AAA GAC ACA GTC TCC ATC ATC CCA GAA AAC AGG TTT TTT 623 

ITYHGGLYISDVQKEDALST 195

ATT ACC TAC CAC GGC GGG CTG TAC ATC TCT GAC GTA CAG AAG GAG GAC GCC CTC TCC ACC 683 

YRCITKHKYSGETRQSNGAR 215

TAT CGC TGC ATC ACC AAG CAC AAG TAT AGC GGG GAG ACC CGG CAG AGC AAT GGG GCA CCC 743

LSVTDPAESIPTILDGFHSQ 235

CTC TCT GTG ACA GAC CCT GCT GAG TCG ATC CCC ACC ATC CTG GAT GGC TTC CAC TCC CAG 803

EVWAGHTVELPCTASGYPIP 255

GAA CTG TGG GCC GGC CAC ACC GTG GAG CTG CCC TGC ACC GCC TCG GGC TAC CCT ATC CCC 863

AIRWLKDGRPLPADSRWTKR 275

GCC ATC CGC TCC CTC AAG GAT GGC CGG CCC CTC CCG GCT GAC AGC CGC TGG ACC AAG CGC 923

ITGLTISDLRTEDSGTYICE 295

ATC ACA GGG CTG ACC ATC AGC GAC TTG CGG ACC GAG GAC AGC GGC ACC TAC ATT TGT GAG 983

VTNTFGSAEATGILMVIDPL 315

GTC ACC AAC ACC CTC GGT TCG GCA GAG GCC ACA GGC ATC CTC ATG GTC ATT GAT CCC CTT 1043

HVTLTPKKLKTGIGSTVILS 335 

CAT GTG ACC CTG ACA CCA AAG AAG CTG AAG ACC GGC ATT GGC AGC ACG GTC ATC CTC TCC 1103

CALTGSPEFTIRWYRNTELV 355

TGT GCC CTG ACG GGC TCC CCA GAG TTC ACC ATC CGC TGC TAT CGC AAC ACG GAG CTG GTG 1163

LPDEAISIRGLSNETLLITS 375

CTG CCT GAC GAG GCC ATC TCC ATC CGT GGG CTC AGC AAC GAG ACG CTG CTC ATC ACC TCG 1223

AQKSHSGAYQCFATRKAQTA 395

GCC CAG AAC AGC CAT TCC GGG GCC TAC CAG TGC TTC GCT ACC CGC AAG GCC CAG ACC GCC 1283
QDFAIIALEDGTPRIVSSFS 415

CAC CAC TTT GCC ATC ATT GCA CTT GAG GAT GGC ACG CCC CGC ATC GTC TCG TCC TTC AGC 1343

EKVVNPGEQFSLMCAAKGAP 435

GAG AAC GTC Cnt: AAC CCC GGC C5AG CAG TTC TCA CTC ATO 1403 

PPTVTWALDDEPIVRDGSHR 455

CCC CCC ACG GTC ACC TCG GCC CTC GAC GAT GAG CCC ATC GTC CGG GAT GGC AGC CAC CGC 1463

TNQYTMSDG? K , ^ ,^93 

ACC AAC CAG TAC ACC ATC TCG GAC 3GC ACC 'S-^ A/l7:r) ^^^^ 

8f26 MWLVTFLLLLDSLHKARPED VGTSLYFVNDSLQQVTPSSS

D3 8 492 — MICT7LLVSKLLLXSLTSCI/SSFTWHRJlYGHQVSSZinCCrQPXrSSQPXNTIYPESS^ ^ 

P20241EUIIO MWRQSTILAAliLVALI*CAaEABSKCanLPPRlTK QPAPGILLnCVAQQNICJBSD ^ 

P32004KnRA >«WAlJlYVWPLIXCSPCrXiaiPEEyttCTH\nffi PPVITEQSPR-RLWFPTD . iz> 

P353 31G-CA -MMKXICSISXSKXSLVFPI*CQ«lSAIi>\n>U5SKLLEELS-QPPTXTM 

QQ2246XOin: -MQTATRJUCPHLLLVAAVALVSS5AWSSALG5QTT *FOPV7EDQPLSVLFPEBSTB . <t 

U11031 — MLSWKQLXLLSFIQCLAGELIiL Q aPVTVKCPSNSIFPVCSXD * i 

X65224 MVLHSHQLl^AGIAPAIiCIJJHLISAIEVPLDSJIIQSBLP-QPPTITXQSVK-DYIVDPlUD Y 


8 f 2 6 VCTVWPCPAAGSPSAALRWYtATGDDIYDVPHIRHVHAKG — TLQLYPFSPSAFNSFIHD 

D3 8 4 9 2 QKVSLNCRARAS PF PWKWRKN- NQDVDLTN - DRYSMV OaNIiVINNPDKQK- D — A 

P20241EUKO NPFIIECiaaWPEPEYSWIKN-GKKFDWOAYDNRMIiRQPG-ROTLVITIPfaaED — — R 

P3 2 0 0 4EURA D - 1 SLKCEASGKPEVQPKirriO-OVHrKPKMI-mmnrQS PHSGSFTXTCMNSNFAQR^ 

P3 5331C-CA N-XVTQCBAKCKPPPSFSWTRN-QTHTDIDKDAQVTKJCPN — SaTLWNXMNaVKAZAYE 

Q02246XONI SQVLIACRAWlSPPATYKWKKN-GTMaKPaSllHQLV aONr-VIKMPTKAQ-0--A 

U11031 KKITLNCEARGNPSPHYRWQLN-GSDIDTSUDHRYKLN OONLXVXNPNUKW-D— T 

X65224 ' N-IFIECEAICCNPVPTrSWTRN-GKFriJVAJCDPlCVSt«»RR--SOTLVXDPHOGGR 


8 f 2 6 NDTPCTAENA^GKXRSPNIRVKAVFREPYTVRVEDQRSMR-aNVAVFKCLIPSSVQE^ 

D3 8492 GXTYCIAS^JKYaMVRSTEATLSFGYI-OPFPPmRPEVlCVKZaKGM^nLLCD 

P20241SURO GHYQCFASNtFGTATSNS\rrVRXAELNAFKDKAAlCTLEAVEQEPrMLKCAAPDOFPS — P 

P32004KURA CXTRCFASNKLGTAMSHEXRLMAKSAPKVfPICrTVKPVrVXZaZSVVLPCOT — L 

P35331C-CA CVrgCTARNXROAAI SNNI VI RP SRS P LWTXKKLEPNKVREGDSI^VLNCaiP PVOL P P - - P 

002 2 4 6XONI CVTQCLASNPVQTWSRfiAIUlFGFLQrr SKSrERDPVXAOTOWOVMLPCNPP — L 

U11031 QSTQCFATNSLaTXVSRttAiCLQrAYLINFKSRMRSRVSVRXOQCWl-LCGPPPHSCX--L 

X6522'4 aEnrQCFARNDYOTALSSKIHl^VSRSPLWPKEKVtJVIEVDEQAPLSLQCNPPPGLPP--P 

8 f 2 6 WSWrXDTVS 1 1 PS NR — rFITYHaOLYI SDVQKSD- - XLSTTRCXTKHKYSGET 

D3 8 4 9 2 S YKKLUrar PVT ITM DKRRJVSQ-TNGNL YIANVESSD RGKTSCFVfiS - - PSIT 

P20241EURO TVNWMIQESXDGSIKSINNSR — MTLDPEONLWFSNVTREDASSDFYTACSATSVFRSKY 

P3 2 0 0 4EUBA RXY1«08aKXLHIKQ DER — VTMOQNCNLY7AWLTSDN — HSDTICKAHFPGTRTI 

P3S331G-CA IXrWKDNAFQRLPQ SER--VSQaLNaDLYFSNVQPEDT — RVDTXCYARFNHTQTI 

Q0224 6XONI SYRKLLNIFPNFIPT DORHFVSQ-TTGNLYIARTNASD liGNTSCIATSHMDPST 

Ul 1 0 3 1 S YAKVFNEYPSFVEE DSRRF VSQ - ETGHLiYI AKVEP SD VGNTTCWTS - -TVTN 

X6522 4 VXrWMSSSMEPIHQ DKR — VSQCQNGDLYTSITVMLQDA — QTDYSCNARFHFTHTI 

8f26 RQSNOARLSVTDPAES IPTlLtXSFHSQEV WAGHTVEL 

D3 8492 KSVTSKFIPLXPIPERTT KPYPADXWQFXDXY — TMMGQNVTL 

P20241EURO XXaNKVLLDVKQMGVSASQ NKHPPVRQYVSRRQS-UOiRQKRKKI- 

P32004EURA IQKEPIDLRVKATNSMXD RKPRIXFPTNSSSHLVALQGQPLVL 

P3 S3 3 IQ-CA QQKQPISVXVFSTK? VTERPPVLLTPMGSTSNKVEIiRGNVlXX- 

QO 2 2 4 6XONI KSVFSKFAQLNIAAEDTR LF AP S X KARFP ASTY — ALVGQQVTL 

Ul 10 3 1 ARVLGSPTPLVLRSDGVMC EYEPKIELQFPETLP - - AAKGSTVja 

X65224 QQKNPYTUC^nCTKKPHNETSIJ<^mDMYSARGVTETC 

• ♦ « 

8f26 PCTASGYPIPAIRX«rLKDGRP — LPABSRWTKRITGLTISDLRTEDSGTTXCEVTNTFCSSX 

03 8492 ECFALGNPVPDIRWRKVLZP — MPTTAEISTSGAVLKIFNIQLEDEGIiTECEAENXRGKD 

P202 41EURO FCIYGCTPLPQTWSKIXSQRXQWSDRITQCHYaKSLVIRQTNFDDAG'i'S'i^CDirSMGVGKA 

P3 2 0 0 4B:URA ECXAEGFPTPTXKWLRPSGPM- PADRVTYQNHNKTLQl.LXVaEEDDGKTRCLAENSLGSA 

P3 53 3 IQ-CA ECXAAGLPTPVIRHIKEGGEL- PAMRTFFENFKKTLKIIDVSEADSCNYKCTARNTLCST 
Q02246XONI

U11031 

X65334 


ECFAFONPVPRlKHlirVDa SLSPQWTTAErrLQIPSVSrKDBarrECSASNSKORD 

TCFAIiGNPVPQIOTniRSDaMP-FPTKIKIJUCFNCWLKPNrQQB 
ECXASCVPJUroXMWriaCOaEL-PAGKTKLENrNKAUlISW^ 


gfjg E-ATGILKVIDPUTVTLTPKKIiKTGICSTVILSCAl-TCSPErriRIfy^^ • 

Q38492 K-HOAJlIWQArPEW\^INOTEVDlGSDLYMPCVATaKPIPTIKMtJCMO 

P20241EURO QsrSIII-NVNSVPYrTKEPEIATAAEDEE\n^EClUkAaVPEPKlSlflHNOKP^ 

P 3 2 0 0 4EUIIA R-HAYYVTVEAAPVWUOCPQSHLYCPOPrARLDCQVQORPO PEVTW^ PVSEIAKDQ 

P3S33ia-CA H-HVaSVTVKAAPyWITAPRNLVLSPQElXyrLICRAiraiPKPSlSmiTNCS^ 

Q0224 6XONI T-VQCRIXVQAaP^WLiG^SOTEADIaSNLRWCKAAA£aXPllPT^m•iI^^ - 

Ui 1 0 3 1 V- ARClUuTYYAKPYWVQLLKDVrrAVEDSLYWaC3LASaKPKPSYRK^^ — 

X6S224 R-HTlSVRVXAAPYWLDEPONLllAPaEDGRL.VCRANaNPKPSIQWLVNaEPlEGSPPNP 

* • * « * 

gf26 a LVLPDEAISIKGLSN 

□3 3492 -yAYHKaELRLYTJVTPENAGMVQCIAENAYGTIYANAEIiKILAlJlPTFEMNPMKXKII^ 

P 2 0 2 4 1 EURO RRTVTDNTIRI INLVXGDTCNYOCNATNS LGYVYKDVYLNVOASP P - - TI SttAP AAVSTV 

P32004EURA KYRIQRaALlLSNVQPSDTH\nrQCEARNRHCl^LANAYrfVVQLPA-KILTAnKQ^^ 

P3 533ia-CA SRKVIx;DTIIFSAVQERSSAWQCNASNEYGyi4*ANArVirVT-AEPP-RILTPANKI*Ya^ 

Q 0 2 2 4 6XON I - VEVIAGDLRF SKL S LEDSGMYQCVAENKHGT XYAS AELAVQAIAPDFRLNPVRRL I PAA 

U11031 -iQijMGALTIANlJ^VSDSC3MFQCIAENKHaLIYSSAELKVI^SAP0FSRNPMXKMiaf\^ 

X6522 4 SREVAGDTIVFROTgiGSSAVYOCNASNKHaYIitANATVSVLDVPP-RIL^ 

aj26 ETLLITSAQKSHSQArOCPA 

038492 KGGRVI I ECKPKAAPKPKFSWSKGTim-VNSSRI LlWKD-aSLEimiTRNTOOlT^ 

P202 41EURO DQRNVTIKCRVNQSPKPLVKWLRASNWLT — GORYTJVQANQDLSIQPVTr spALt KYTCYA 

P32004EURA QQSTAYLLCXArG?J?VPSVQVnjDETCTTVI.QDERFrPYANQTLaiRDL0ANDTOR^^ 
P353 31G-CA ADSPAZ-IDCAYFGSPKPEIEWFRGVKGSILRGOTYVFHDOTTI-EIPVAQiar'STGTY^^ 
Q02246XON1 ROGIILIPCQPRAAPKAWLWSKGTSILVNSSRVTVTPD-GTLIIRKISRSDECICYTCPA 
U1103 1 vaSLVILDCKPSASPRJO-SFVnCXCDTVVREQARlSLI-OT-GGrjaMNV^^ 
X65224 QYNRTRIJXrPFFCSPIPTLRWFIQICQaNMLDGCNYXAHENGSLEMSMARKEDQGIT^ 

9f2 6 TRKAQTAQDFAlIALEIX3TPRIVSSFSEKVVNPCEgFSI21CAAKCAP--PrTVT^^ 
038492 ENNRCKANSTGTLVITNPT -RI 1 LAPINADITVCENATMQCAASFDPSUDLTFVWSrKGY 

P202 41EURO QjTOCEIQADOSLVVKKHT-RITQEPQNYEVAACQSATrRCaJEAHDDTLEIEIDWWXDGQ 
P3 2004EURA ArOQNNVTIMANLKVrnAT-QITQGPRSTIEKKCSRVTrTO^ 

P3 5 331G-CA RNlCLGKTQNEV0LE\nCDPT-MllKQPQY7CVXQRSAOASFECVIKHDPTLIPTVIWJ^ 
Q02246XONI ENFMGJUlNSTGII.S\miXAT-KlT]:J^PSSADINLGnNLTLQCHASHDPTOTLTrT« 
U 1 1 0 3 1 ENQFGKANGTTQLWT EPT-RIILAP SNMDVAVGES 1 1 LPCQVQHDP LLDIKFAWYFNGT 

X65224 TNIIXSlOrEAQVRLEVraPT-RIVRGPEDQVVKRGSMPRLHCRVKHDPTLKLt^ - 


8f26 prVRDGSHRTNQYTMS "J? 

038492 viDFNKEITNIHYQRKrMLDANGELLIRNAjQLKHAGRYTCTAQTIVDNSSASADLVVRaP e 

P20241EURO SIDFEAQPR rVKTNDN--SLTIAXTKELDSCETTCVARTRl-DEATARANLZVQDV C 

P32004EURA .-DLQELGD--.SDKYFIEIX3--RLVIHSLDYSIXy:nrrSCVASTELDWESRAQLL\ATO C 

P353 31G-CA NNELPDD ERFLVGKD--NLTIMNVTDW3DGTTTCIVNTTLDSVSASAVLTVVAA C- 

Q02246XONI piDFDKPGO--HYRRTTJVKj:TlGDLTILNAQLRHOaKTTCMAQT\AroSASK C 

U11031 LTI3FKKIX3S--HrEKTOGSSS-GDIJWIRNIOLKKSGKXra5VQTG\ro^^ C 

Xfi5224 --DAPLYIG NRMKKZDD--GLTIYGVAElCDaCDyTCVASTSLDKDSAXAYLTVLAl C 


FIG. 6 


SDOCID: <WO 9822491 A 1_L> 


INTERNATIONAL SEARCH REPORT 


International application No. 
PCT/US97/20201 


A. CLASSIFICATION OF SUBJECT MATTER 

IPC(6): C07H 21/04; C07K 14/47; C12N 5/16, 15/70, 15/79; C12Q 1/68
US CL: 435/6, 320.1, 325; 530/350; 536/23.5
According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

U.S.: 435/6, 172.3, 320.1, 325, 365; 530/350; 536/23.1, 23.5; 935/22, 24, 27, 79

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 


Electronic data base consulted during the international search (name of data base and, where practicable, search terms used) 
APS, STN (Biosis, CAPlus, LifeSci, Medline, INPADOC, WPIDS), GenBank, EMBL, PIR


C. DOCUMENTS CONSIDERED TO BE RELEVANT 


Category*   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

            US, 5,525,486 A (HONJO et al.) 11 June 1996, see entire document.   1, 3, 4

            US, 5,536,637 A (K. JACOBS) 16 July 1996, see entire document.   1, 3, 4


[ ] Further documents are listed in the continuation of Box C.   [ ] See patent family annex.


* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance

"E" earlier document published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than the priority date claimed

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art

"&" document member of the same patent family


Date of the actual completion of the international search
27 JANUARY 1998 

Date of mailing of the international search report 

3 FEB 1998

Name and mailing address of the ISA/US 
Commissioner of Patents and Trademarks 
Box PCT

Washington, D.C. 20231 
Facsimile No. (703) 305-3230 

THOMAS G. LARSON, PH.D.
Telephone No. (703) 308-0196




(second sheet)(July 1992)


